Abstract

I discuss various formulations of stochastic Einstein locality (SEL), which is a 
version of the idea of relativistic causality, i.e. the idea that influences propagate 
at most as fast as light. SEL is similar to Reichenbach's Principle of the Common 
Cause (PCC), and Bell's Local Causality. 

My main aim is to discuss formulations of SEL for a fixed background space- 
time. I previously argued that SEL is violated by the outcome dependence shown 
by Bell correlations, both in quantum mechanics and in quantum field theory. 
Here I re-assess those verdicts in the light of some recent literature which argues 
that outcome dependence does not violate the PCC. I argue that the verdicts 
about SEL still stand. 

Finally, I briefly discuss how to formulate relativistic causality if there is no
fixed background spacetime. 
1 Introduction



This paper addresses what the EPR-Bell correlations imply about relativistic
causality, i.e. the requirement that causal influences (signals) can travel at most as 
fast as light, within quantum theories. The overall message is that the situation is 
subtle: various precise formulations of relativistic causality are satisfied, and others 
violated, and (more interestingly) that there remain plenty of open questions.

Until the last Section of the paper, I discuss stochastic relativistic causality for 
a fixed relativistic spacetime, in which probabilistic events occur. The spacetime is 
fixed, in the sense that these events' outcomes have no influence on the structure of 
the spacetime. In this context, the literature considers many different formulations 
of stochastic relativistic causality; so I shall have to be very selective. I shall focus 
on some formulations of 'stochastic Einstein locality' (SEL): an idea which was first 
proposed by Hellman (1982) for the special case of Minkowski spacetime as the fixed 
background. There are three reasons for this focus. 

(i) : SEL is a natural expression of the rough idea that causal influences cannot 
travel faster than light. Indeed, it is so natural that I think it ought to be as familiar to 
philosophers of probability and causation as is Reichenbach's Principle of the Common 
Cause (PCC). We will see, already in Section [2], similarities between some formulations
of SEL and PCC; and between SEL and Bell's condition, familiar to physicists, which 
he called 'local causality'. 

(ii) : Assessing how both SEL and PCC fare in the Bell experiment leads to some 
surprises for the folklore that the Bell experiment refutes PCC. There are two points 
here. First: broadly speaking, the arguments for the folklore can be adapted so as to 
provide arguments that some natural formulations of SEL are violated by the experi- 
ment. So if we want to express precisely what is "spooky" about Bell correlations, the 
violation of SEL is a good candidate. Second: recent work by what I shall call 'the 

^Another paper (Butterfield 2007) focuses instead on less well-known "single particle" violations
of relativistic causality. It also locates the discussion within the general philosophy of causation, 
especially the causal anti-fundamentalism of Norton (2003, 2006). 







Budapest school' has objected to the arguments for the folklore, on the grounds that 
the arguments assume different correlations should have the same common cause (now 
often called a 'common common cause'). The Budapest school backs up this objection 
with theorems to the effect that natural formulations of PCC are obeyed by the Bell 
correlations; (they have such theorems for both elementary quantum mechanics, and 
quantum field theory). So someone interested in SEL needs to address the question 
how this objection, and these theorems, bear on SEL. 

(iii): The third reason is personal. In previous work (1989, 1992, 1994, 1996), I 
endorsed the folklore that Bell correlations refute PCC, and also argued that they 
refute SEL. (Fortunately, I will need very little repetition from that earlier work; nor 
need I presuppose it.) So I have a responsibility to consider the Budapest school's 
objection and theorems. 

There will be two main morals. The first is that there are various formulations 
of relativistic causality, indeed of SEL. It is of course no surprise that formulations 
will vary if one considers different conceptions of events and causal influences. But 
we will see, more surprisingly, that even within a fixed conception, the intuitive idea 
of SEL can be made precise in inequivalent ways. (But as we would probably expect: 
the richer the conception of events, influences etc., the more inequivalent formulations 
there are.) 

In the two main Sections ([2] and [3]), I shall consider a philosophical conception of 
events and influences, close to Hellman's original one (1982). Section [2] develops three 
natural formulations of SEL; and some conditions under which they are equivalent. 
In Section [3], I apply these formulations of SEL to the Bell experiment; and so turn 
to topics (ii) and (iii) above. I admit that the situation is not clear-cut: one has 
to make some judgments about how to apply the formulations of SEL to the experi- 
ment, and about setting aside various loopholes e.g. about detector efficiencies. But 
given such judgments, one can ask whether the Bell correlations (specifically: outcome 
dependence) refute SEL, or PCC. 

This yields my second, more specific, moral: despite the work of the Budapest 
school, a good case can be made that Yes, the correlations do refute PCC and SEL. 
Indeed, for PCC in quantum mechanics, the case has already been made. There are two 
main points here. First: for the Bell experiment, it is reasonable to postulate a common 
common cause — as the arguments for the folklore blithely did; (so far as I know, Placek 
first argued for this). Second: though the Budapest school is right that Bell's theorems 
traditionally assumed a common common cause, there are recent theorems (especially 
by what I shall call 'the Bern school') assuming only separate common causes for the 
various correlations. So I will report these two points, and then adapt them to SEL. 
(In case this moral sounds defensive, let me confess at the outset that when in previous 
work I endorsed the folklore, I was blithely unaware of my interpreting PCC strongly, 
viz. as requiring a common common cause: (as, I suspect, were some other folk). So: 
mea culpa, and all credit to the Budapest school for emphasising the issue.) 

In the last two Sections, I briefly discuss how SEL fares beyond quantum mechanics: 



in algebraic quantum field theory (AQFT: Section [4]); and in theories with an unfixed,
i.e. dynamical, spacetime (Section [5]). AQFT provides a richer and more precise
conception of events, influences and probabilities than the philosophical conception of
Sections [2] and [3]. As one would expect, this richer conception provides many more
conditions expressing the broad idea of relativistic causality. But SEL and PCC can
each be applied in a clear-cut way to this conception. The situation is then broadly as
it was in Section [3]. That is:

(a) there are inequivalent formulations of SEL and PCC; 

(b) the Budapest school shows that a natural formulation of PCC is obeyed by the 
Bell correlations; but also:

(c) I argue that some natural formulations of SEL are violated.

Finally, Section [5] turns briefly to stochastic relativistic causality in dynamical
spacetimes. This is a much less well-developed field: in particular, we will see, already
in Sections [2], [3] and [4], ways in which our formulations of SEL presuppose a fixed
spacetime. For reasons of space, I make just two comments by way of advertising some open
problems. I report an experimental test of whether SEL applies to metric structure, 
proposed by Kent; and briefly discuss how SEL fails trivially in the causal set approach 
to quantum gravity of Sorkin and others. 

2 Formulating Stochastic Einstein Locality 

After some preliminaries about events and regions (Section [2.1]), I will present the idea
of SEL (Section [2.2]), give three precise formulations of it (Section [2.3]) and discuss
implications between them (Section [2.4]). The discussion will be informal and the
proofs elementary. 

2.1 Events and regions 

Imagine we are given a spacetime $\mathcal{M}$ in regions of which stochastic events occur.
Since closed causal curves notoriously impose consistency conditions on initial data
assigned to a spacelike hypersurface, which can be so severe as to veto the variety
of outcomes associated with a stochastic process, it will be best to assume that $\mathcal{M}$
has no closed causal curves. Indeed, in this Section we shall want to appeal to some 
spacetime notions, such as spacelike hypersurfaces, and some properties of causal "good 
behaviour" enjoyed by some spacetimes, without going into technicalities about these 

^A bibliographic note about how Sections [2] to [4] add to and correct my previous work; more details
in the sequel. Section [2]'s material adds to the discussion in my (1994, Section 5). Section [3] endorses
some main claims of my (1989, Section 2f.; 1992, pp. 74-77; 1994, Sections 6, 7) but corrects others, in
the light of the Budapest school's work. As to AQFT, discussion of SEL in this setting was initiated
by Redei (1991), and later work includes Muller and Butterfield (1994). Section [4] will endorse my
(1996)'s claim that a formulation of SEL is violated by AQFT, and relate this to Redei and Summers'
(2002) theorem that a formulation of PCC is obeyed by AQFT. 
notions and properties. So for simplicity, I will assume from the outset that $\mathcal{M}$ has the
strong good-behaviour property, stable causality. I shall not need the exact definition 
of this; (cf. e.g. Hawking and Ellis 1973, p. 198; Geroch and Horowitz 1979, p. 241; 
Wald 1984, p. 198). Suffice it to say that: 

(i) : The idea is that a spacetime is stably causal if not only does it lack closed 
timelike curves, but also the spacetime resulting from a slight opening out of the light- 
cones does not have any such curves. 

(ii) : Stable causality has various useful consequences. In particular, a spacetime is 
stably causal iff it has a "global time function", i.e. a smooth function $f : \mathcal{M} \to \mathbb{R}$
whose gradient is everywhere timelike.

For later reference, I should also mention a much stronger good-behaviour notion: 
a spacetime is globally hyperbolic if it has a Cauchy surface. Here, a Cauchy surface
$\Sigma$ is a "global instantaneous slice" in that for every point $p \in \mathcal{M}$: either p lies in $\Sigma$'s
future domain of dependence $D^+(\Sigma)$, i.e. every past-inextendible causal curve through
p intersects $\Sigma$; or p lies in the past domain of dependence $D^-(\Sigma)$, in the corresponding
sense that every future-inextendible causal curve through p intersects $\Sigma$. If a spacetime
is globally hyperbolic, then it is also stably causal; and the global time function f can
be chosen so that each surface of constant f is a Cauchy surface; and $\mathcal{M}$ then has
the topology $\mathbb{R} \times \Sigma$, where $\Sigma$ denotes any Cauchy surface. Thus a globally hyperbolic
spacetime can be foliated by Cauchy surfaces. (Cf. Hawking and Ellis 1973, p. 205-212; 
Geroch and Horowitz 1979, p. 252-253; Wald 1984, p. 201, 205, 209.) 

Since the events E, F, ... occurring in regions of $\mathcal{M}$ are to be stochastic, and
influenced by prior events, it is natural to envisage a time-dependent probability function
$pr_t$ which assigns probabilities $pr_t(E)$ etc.; with the values of the probabilities reflecting
how the events before t happen to have turned out, and so how the probability of E 
has waxed and waned. 

Since $\mathcal{M}$ is a relativistic spacetime, we will take t as a hypersurface. Again, I shall
not worry about the various technical meanings of 'hypersurface'. Suffice it to say that
for the most part, t will be a spacelike hypersurface without an "edge", stretching right
across $\mathcal{M}$ and dividing it into two disjoint parts, the "past" and "future". However:

(a) : Not all the hypersurfaces considered will be everywhere spacelike. For we 
will allow t to include parts of the boundaries of past light-cones. Such boundaries 
are always 3-dimensional embedded sub-manifolds of $\mathcal{M}$ which are "well-behaved" in
being achronal, i.e. having no pairs of points p, q connected by a timelike curve (Wald 
1984, p. 192). 

(b) : And if t is spacelike, it need not be a Cauchy surface. In particular, for much 
of the discussion some weaker notion will suffice, e.g. a closed achronal set without 
edge (often called a 'slice': Wald 1984, p. 200).

I shall also use history for the collection or conjunction of all events up to a given 
hypersurface (or within a given spacetime region). This usage will be informal, and 
in particular is not meant to have any of the technical connotations of the 'consistent 
histories' programme in quantum theory. So the probability distribution $pr_t$ is meant



to reflect how history happens to have turned out (and so modified the prospects of 
future events), up to the hypersurface t. 

There are two issues about the association of an event with a spacetime region: 
the first philosophical, the second technical. (Section [4] will also have more to say
about this association in AQFT.) Most philosophers (including Hellman and those 
listed in footnote 2) think of an event as a contingent matter of particular fact that 
is localized in a region, by being about properties of objects in that region (or about 
properties of the region itself) that are intrinsic to the region. Philosophers dispute 
how to analyse, and even how to understand, the intrinsic-extrinsic distinction among 
properties. But the intuitive idea is of course that an intrinsic property implies nothing 
about the environment of its instance. For example: 'he was fatally wounded at noon' 
does not just attribute the intrinsic property of being wounded at noon, since 'fatally' 
implies a later death. On the other hand, 'the average electromagnetic energy density 
in spacetime region R is a' no doubt does attribute an intrinsic property to R. 

I shall not need to be precise about 'intrinsic'; nor consequently, about how events 
are associated with regions. I shall even use E indifferently for an event and for the 
spacetime region with which it is associated. So to sum up so far: the course of events,
or history, up to a hypersurface t gives a probability $pr_t(E)$ for an event E to occur in
(report some intrinsic properties of) a certain region future to t. So intuitively, E is 
also a set or union or disjunction of other ways in which history could turn out in its 
region, or indeed in the rest of the future of t. 

The technical issue is whether we should require an event's spacetime region to 
have certain properties, e.g. being open, or bounded, or convex. In fact, Sections [2]
and [3] will not need any such requirements; but in Section [4], AQFT will work with
open bounded regions. But with the exception of one place in Section [2.4.1.2]'s proof,
we can, for the sake of definiteness, assume throughout Sections [2] and [3] that events'
regions are bounded and convex. (This simplifies drawing the diagrams corresponding 
to the different formulations of SEL.) 

A related issue is whether we should define the past light-cone of the region in which 
an event E occurs — which will be important for SEL — in terms of the chronological 
or the causal past. In fact, the convention is to define these in slightly non-parallel 
ways, as follows; (e.g. Hawking and Ellis 1973, p. 182; Geroch and Horowitz 1979, 
p. 232; Wald 1984, p. 190; of course, corresponding remarks hold for futures). The 
chronological past $I^-(p)$ of a point $p \in \mathcal{M}$ is the set of points connectible to p by a

^But note that some authors' notations do distinguish events and their associated regions: for 
example, Henson (2005: Section 2.2.1) who writes dom(E) for the region of event E, which he calls the
'least domain of decidability of E'. Henson thinks of the association epistemically. He says dom(E)
is the unique smallest region such that knowing all its properties enables us to decide whether E 
occurred. I would reply that this epistemic gloss obviously does not avoid the need to restrict the 
properties considered to be intrinsic; and that with this restriction, the epistemic gloss is unnecessary. 
But this disagreement does not affect the plausibility of Henson's axioms, which describe how the map 
dom from events to spacetime regions interacts with the Boolean operations on events and regions: 
nor therefore, the results he deduces, which concern his formulations of PCC. 



past-directed timelike curve; $I^-(p)$ is open, but since the curve must be non-zero, in
general $p \notin I^-(p)$. For a set $E \subset \mathcal{M}$, we define $I^-(E) := \bigcup_{p \in E} I^-(\{p\})$, so that $I^-(E)$
is always open. On the other hand, the causal past $J^-(E)$ of a set E (where maybe
$E = \{p\}$) is defined so as to include E: viz. as the union of E with the set of points
connectible to a point of E by a past-directed causal curve; ($J^-(E)$ need not be closed).
Broadly speaking, it is usually easier to work with chronological pasts (as Geroch and
Horowitz remark). But the results in later Sections are more easily obtained if
any set E is contained in its past light-cone: which with the conventional definitions,
suggests we should use $J^-(E)$. On the other hand, if E is open then $E \subset I^-(E)$. So in
effect, we have a choice: either we use causal pasts, or we require E to be open and use
chronological pasts. But again, I shall not need to be precise, and so need not choose.
To signal this lack of commitment, I shall adopt the idiosyncratic notation, $C^-(E)$,
for the past light-cone of E (or of the region in which an event E occurs). But if one
wanted to be definite, one could adopt either of the above choices.

2.2 The idea of SEL 

The idea of relativistic causality can now be expressed as 'Stochastic Einstein Locality' 
(SEL): For an event E occurring in the spacetime $\mathcal{M}$, the probability at an earlier
time (hypersurface) t that E occurs, $pr_t(E)$, should be determined by history (i.e. the
events that occurred) within that part of the past light-cone of E that lies before t; i.e.
by history within $C^-(E) \cap C^-(t)$. Here we envisage that t:

(i) is spacelike, if not everywhere then at least within $C^-(E)$; and

(ii) cuts $C^-(E)$ into two disjoint parts, the "summit" $C^+(t) \cap C^-(E)$ and the "base"
$C^-(t) \cap C^-(E)$.

I shall say that a hypersurface t satisfying (i) and (ii) divides $C^-(E)$.

So our present aim is to make more precise this idea of determination by the history 
within the past light-cone; and in doing this, we shall see some inequivalent formulations 
of the idea. (But we shall maintain the idea that t, even if not everywhere spacelike, 
divides $C^-(E)$.)

To talk about the various possible total courses of events, it is natural to use 
philosophers' jargon of 'possible worlds'. A possible world corresponds in physicists' 
jargon to a 'solution (throughout all time)' or 'dynamically possible total history of 
the system'; and in the jargon of stochastic processes, to a 'trajectory' or 'realization'. 
I shall write w for a possible world, and W for the set of possible worlds envisaged. 

I should note here two differences of notation from stochastic process theory. (1): 
That theory usually takes the set of realizations ('worlds') as the basic sample space, 
with an event, i.e. a set of realizations, being given by a time-indexed random variable 
taking a certain value. This means that although I have written $pr_t(E)$ where it is
understood that E might specify the value, say q, of physical quantity Q, in stochastic
process theory one writes instead something like $pr(Q_t = q) = pr(\{\text{realizations } w : Q_t(w) = q\})$.
(2): Stochastic processes are often assumed to be Markovian. I will not need the
definition of this; but it prompts one to use a notation like my $pr_t$ to represent the
probability prescribed, not by (i) how all of history up until the hypersurface t happens
to turn out, but by (ii) the instantaneous physical state at t itself. So I stress that my
notation $pr_t$ and similar notations below do mean (i): I shall not need the Markov
property. 

It is also natural to use the fact that $\mathcal{M}$ is fixed so as to identify times t between
worlds $w \in W$, yielding a doubly-indexed probability function $pr_{t,w}$. Here, identifying
times between two worlds does not mean the very suspect idea of a "meta-time"
somehow external to both worlds; nor the technical idea from topology of identifying 
parts of two spaces to define a single third space. It is to be understood just as the 
two worlds matching on all their history up to two hypersurfaces each in its world: 
given such matching, the hypersurfaces are then identified in the sense of being both 
labelled t. And here for two regions in two possible worlds to 'match' means that 
they are isomorphic with respect to all properties and relations intrinsic to the regions. 
In the jargon of a physical theory: there is a smooth bijection between the regions' 
points (and so between their sub-regions) that is an isomorphism of all the fields, and 
whatever other physical quantities, the theory discusses.

Philosophers will recognize the apparatus I have invoked — events associated with 
regions, possible worlds which can match on regions, and time-dependent objective 
probability functions — as reminiscent of David Lewis' metaphysical system. (Cf. es- 
pecially his (1980, 1986): Lewis calls such probabilities 'chances'.) On the other hand, 
Hellman (1982) does not use worlds and chances. He talks of formalized physical the- 
ories with vocabulary for probability, and their models. So I should here admit that 
indeed, philosophical differences turn on the choices between these frameworks. But 
fortunately, the differences will not affect anything in this paper; in particular, nothing 
in what follows needs any of Lewis' contentious doctrines about worlds or probabilities. 

2.3 Three formulations of SEL 
2.3.1 The formulations 

With these preliminaries, it is easy to write down three different formulations of SEL. 
The first concerns the probability of a single event E. Both the second and third 
are statements that E is stochastically independent of certain other events F. Each

*Two minor points, one specific and one general. (1): We shall see that matching in just a past 
light-cone, rather than in the entire past of a spacelike hypersurface, is enough for some formulations 
of SEL. (2): Note that the idea of two worlds matching up to a time but not thereafter is very different 
from the idea of a single world branching or splitting. Though the latter idea has been developed, 
and used to analyse quantum nonlocality, by the 'Pittsburgh-Krakow school' (cf. footnote 16 for 
references), I will not need it in this paper. For cautions about making sense of branching spacetime, 
cf. Earman (2006).



formulation will lead on to the next.

First, it is natural to formulate SEL's idea that a probability is determined by 
history within a past light-cone, as follows: for two worlds $w, w' \in W$ that match in
their history in $C^-(E) \cap C^-(t)$, the probabilities $pr_{t,w}(E)$ and $pr_{t,w'}(E)$ are equal. This
gives our first formulation of SEL, which I will call 'SELS'. The second 'S' stands for
'single', since we consider a single event E. (For reasons about quantum field theory,
given in Section [4], this second 'S' also stands for 'satisfied'.) Cf. Figure 1.




Figure 1: SELS 



SELS: Let worlds $w, w' \in W$ match in their history in $C^-(E) \cap C^-(t)$.
Then they match in their probability at t of E:

$$pr_{t,w}(E) = pr_{t,w'}(E). \tag{2.1}$$

Here, as discussed in Section [2.2], we envisage that the matching histories
in w, w' justify the identification of the two hypersurfaces labelled t, and
that in both worlds t divides $C^-(E)$.

But there is another equally natural approach to formulating SEL. Instead of saying 
that the unconditional probability of a single event E is determined by the "truncated
cone" of history in $C^-(E) \cap C^-(t)$, we can instead say that the probability of E is
unaltered by conditionalizing on an event F that occurs in the Elsewhere, i.e. spacelike 
to E. Here, philosophers will recognize that we connect with Reichenbach's Principle 
of the Common Cause (PCC), and its legacy; but I will postpone the comparison with 
PCC until Section [2.3.2].

But we need to be careful about exactly which events, the given event E is to be 
stochastically independent of. For we are considering a probability function that is 

^Besides, other authors have other formulations. My first and third formulations (essentially as 
in my (1994, Section 5.2)) are analogous to Hellman's (1982) two formulations; though he then adds 
provisos to ensure that SEL is obeyed in the Bell experiment; cf. Section [3.3.0.2]. My second formulation
below seems to be new. On the other hand, I will not here pursue my (1994, Section 5.2)'s formulation 
using counterfactuals with probabilistic consequents. 
time-dependent, and even world-dependent: $pr_t$ or $pr_{t,w}$. So consider an event F that
is future to t and spacelike to E's region, and yet close enough to E that the intersection
of the past light-cones $C^-(E) \cap C^-(F)$ has a part (including its "summit") future to t:
i.e. $C^-(E) \cap C^-(F) \cap C^+(t) \neq \emptyset$. Such an event F may well convey information about
how history happens to turn out in the future of t, viz. just before the "summit" of
$C^-(E) \cap C^-(F)$. So F may convey information about the prospects for whether E
occurs, so that E is not stochastically independent of F, according to $pr_t$ (or $pr_{t,w}$).
In short: conditioning on such an F gives information about those of E's antecedents
that lie in the future of t.

There are two natural solutions to this problem. They will yield our second and 
third formulations of SEL. 

The first solution requires that F be not only future to t and spacelike to E, but
also far enough away (spatially) from E that $C^-(E) \cap C^-(F) \cap C^+(t) = \emptyset$. With this
requirement, any common causes of E and F, whose influence on E or F propagates
at most as fast as light, must lie in the past $C^-(t)$ of t. (This follows, whatever exactly
we decide to mean by 'common causes', provided they lie in $C^-(E) \cap C^-(F)$: cf. the
second requirement (ii) in the assumption that t divides $C^-(E)$.) So the idea is that all
information about such common causes is already incorporated in $pr_t$, and so adding
that F in fact occurs gives no new information about E. Besides, by making our
formulation of SEL consider all hypersurfaces t to the past of E, we can argue that
even events F, spacelike to E but spatially very close to E, will be required to be
stochastically independent of E, according to the probability functions $pr_t$ (or $pr_{t,w}$)
for sufficiently late t (i.e. for t so late that $C^-(E) \cap C^-(F) \cap C^+(t) = \emptyset$).

Thus we get another formulation of SEL. I will call it 'SELD1', where the 'D' stands
for 'double', since we consider two events E and F. (For reasons about quantum field
theory, given in Section [4], the 'D' also stands for 'denied'.) Cf. Figure 2.




Figure 2: SELD1



SELD1: For any world w, for any hypersurface t earlier than the event E
and dividing $C^-(E)$, and any event F future to t and spacelike to E and
such that $C^-(E) \cap C^-(F) \cap C^+(t) = \emptyset$:

$$pr_{t,w}(E \& F) = pr_{t,w}(E) \cdot pr_{t,w}(F). \tag{2.2}$$

Remark: One might strengthen SELD1 by dropping the restriction that F be future
to t, i.e. by letting F either be in $C^-(t)$ or "lie across" t. But as will be clear from our
discussion of SELD2, this strengthening is not needed for our purposes.
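
The side-condition in SELD1, that the common past of E and F has no part future to t, can be made concrete in a toy case. The following is only a minimal sketch under assumptions that go beyond the text: spacetime is 1+1 Minkowski with c = 1, E and F are point events given as (time, position) pairs, and t is the flat slice at time t0. The function names are hypothetical. For spacelike separated point events, the "summit" of $C^-(E) \cap C^-(F)$ lies at time $(t_E + t_F - |x_E - x_F|)/2$, so the condition holds just in case that time is no later than t0.

```python
# Toy check of the SELD1 side-condition in 1+1 Minkowski spacetime (c = 1).
# Assumptions (not from the text): E and F are point events (time, position),
# and the hypersurface t is the flat slice {time = t0}.

def spacelike(e, f):
    """True if the point events e and f are spacelike separated."""
    (te, xe), (tf, xf) = e, f
    return abs(xe - xf) > abs(te - tf)

def common_past_summit(e, f):
    """Time of the apex of C^-(E) ∩ C^-(F), for spacelike separated e and f."""
    (te, xe), (tf, xf) = e, f
    return (te + tf - abs(xe - xf)) / 2

def seld1_condition_holds(e, f, t0):
    """True if t0 is earlier than E, F is future to t0 and spacelike to E,
    and the common past of E and F has no part strictly later than t0."""
    return (e[0] > t0 and f[0] > t0 and spacelike(e, f)
            and common_past_summit(e, f) <= t0)

E = (10.0, 0.0)
F_far = (10.0, 8.0)    # summit of the common past at time 6
F_near = (10.0, 2.0)   # summit of the common past at time 9

print(seld1_condition_holds(E, F_far, 7.0))    # True
print(seld1_condition_holds(E, F_near, 7.0))   # False
print(seld1_condition_holds(E, F_near, 9.5))   # True
```

The last line illustrates the point made for SELD1: an F that is spatially close to E fails the condition for an early hypersurface, but satisfies it once t is late enough.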

The idea of the second solution is to require that F be not only spacelike to E, but 
in the past of t, i.e. F lies in the difference $C^-(t) - C^-(E)$. This idea raises three issues:
which I will consider in increasing order of significance for our discussion. Only the
third issue is in any sense a "problem": it will lead directly to SELD2. It also introduces
the important idea that objective probability evolves over time by conditionalization 
on how history happens to turn out. 

(i): This requirement will exclude any F for which $C^-(E) \cap C^-(F) \cap C^+(t) \neq \emptyset$.
This is so even if, as allowed in Section [2.1], t is not everywhere spacelike. For the stable
causality condition implies that for any region X: if $X \subset C^-(t)$ then $X \cap C^+(t) = \emptyset$;
one then applies this to $X := C^-(E) \cap C^-(F)$.

(ii): Recall that SELD1 endeavoured to cover all appropriate events F by
considering all hypersurfaces t to the past of E, even ones just before E. So also here:
by considering all such hypersurfaces, we will endeavour to ensure that any F that is
spacelike to E will be in the past of one such hypersurface. After all, such hypersurfaces
can "tilt up" towards the future, outside $C^-(E)$, so as to include most of the
Elsewhere of E. So any event F which we intuitively want to claim to be stochastically 
independent of E can get included in our formulation of SEL. 

(iii): Requiring F to be in the past of t, and then considering the conditional
probability function $pr_t(\,\cdot\,/F)$ (or $pr_{t,w}(\,\cdot\,/F)$), raises the question what should be the
objective probability of a past event that actually occurs. The usual answer, in both
the philosophical and technical literature, is that this probability is unity. This implies
that conditioning $pr_t(\,\cdot\,)$ on F makes no difference: that is: $pr_t(\,\cdot\,) = pr_t(\,\cdot\,/F)$. So we
cannot express our intuitive idea that E is stochastically independent of F because F
is spacelike to E, by using $pr_t$.

At this stage, the obvious tactic is to use a probability function determined, not by 
all of history up to t, but by the history lying both in the past of t and within $C^-(E)$.
So the idea is now that within a world w, the history up to t and within $C^-(E)$
prescribes a probability function, according to which E is stochastically independent
of any possible event F outside $C^-(E)$ but earlier than t. Writing H for this history,
and $pr_{H,w}$ for this probability function, we get another formulation (with the 'D2'
indicating that this is our second double-event formulation): cf. Figure 3:

^I write eq. [2.2] rather than $pr_{t,w}(E/F) = pr_{t,w}(E)$, so as to avoid niggling provisos about non-zero
probabilities. I also adopt this tactic in what follows. But anyway such provisos can be largely avoided 
in measure theory by using the idea of conditional expectation (e.g. Loeve 1963, p. 341). 
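Figure 3: SELD2

SELD2: For any world w, for any hypersurface t earlier than the event E
and dividing $C^-(E)$, and any possible event F outside $C^-(E)$ but earlier than t:

$$pr_{H,w}(E \& F) = pr_{H,w}(E) \cdot pr_{H,w}(F); \tag{2.3}$$

where H is w's history in the intersection $C^-(t) \cap C^-(E)$.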

Note that the surface that forms the boundary of the intersection $C^-(t) \cap C^-(E)$
consists partly of (part of) the boundary of a past light-cone, viz. $C^-(E)$. (The
intersection is like a "Table Mountain", whose slopes are part of the boundary of
$C^-(E)$.) Since we have written H for the history in this intersection, it is natural to
write $\partial H$ for the boundary, and so to write eq. 2.3 with the surface $\partial H$ as a subscript,
on analogy with SELD1 and eq. 2.2:

$$pr_{\partial H,w}(E \& F) = pr_{\partial H,w}(E) \cdot pr_{\partial H,w}(F) ; \tag{2.4}$$

Finally, there is an important generalization of the idea that the objective
probability of a past event that actually occurs is unity: a generalization which is also
usually accepted in both the philosophical and technical literature. Namely: objective
probability evolves over time by conditionalization on the intervening history. That is
to say, in our notation: if the hypersurface t is later than hypersurface t', and H* is
the conjunction of all events that occur between t' and t, then

$$pr_t(\ ) = pr_{t'}(\ / H^*) . \tag{2.5}$$

Cf. for example, Lewis (1980, p. 101). Of course, any realistic theory is likely to allow
continuously many different possible histories between t and t', so that any one of
them is liable to have probability zero. In that case, eq. 2.5 can be made sense of using
conditional expectation; cf. footnote 6.
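Conditionalization on the intervening history, as in eq. 2.5, can be illustrated with a finite toy model. The following Python sketch is only an illustration under stated assumptions: the two possible intervening histories `h0` and `h1`, the later event `E`, and all numerical values are hypothetical and are not drawn from the text.

```python
from fractions import Fraction

# Hypothetical probability function at the earlier hypersurface t'.
# Each "world" is a pair: (intervening history between t' and t, later outcome).
pr_tprime = {
    ("h0", "E"): Fraction(1, 8), ("h0", "not-E"): Fraction(3, 8),
    ("h1", "E"): Fraction(3, 8), ("h1", "not-E"): Fraction(1, 8),
}

def conditionalize(pr, h_star):
    """Return pr( / H*): restrict to worlds with intervening history h_star and renormalize."""
    norm = sum(p for (h, _), p in pr.items() if h == h_star)
    if norm == 0:
        # In realistic (continuous) cases one would need conditional expectation instead.
        raise ValueError("cannot conditionalize on a probability-zero history")
    return {(h, e): p / norm for (h, e), p in pr.items() if h == h_star}

# Probability function at the later hypersurface t, given that h1 actually occurred.
pr_t = conditionalize(pr_tprime, "h1")
p_E_at_t = sum(p for (_, e), p in pr_t.items() if e == "E")
print(p_E_at_t)
```

Exact fractions are used so that the renormalization can be checked without rounding error; the probability-zero branch marks the point at which the finite sketch stops being adequate.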
2.3.2 Comparisons



So much by way of stating three formulations of SEL. Before considering their relations
(Section 2.4), I briefly comment on the similarity of the two SELD formulations to (i)
Reichenbach's PCC and (ii) Bell's condition of 'local causality'. The main point is of
course that the shared broad idea is so natural that it is no surprise that it is formulated
independently by authors in different literatures: presumably, Bell had never heard of
Reichenbach.^7

About PCC: In his PCC, Reichenbach's idea (1956, Section 19) was that if:

(i): events E and F are correlated according to some objective probability function
pr, $pr(E \& F) \neq pr(E) \cdot pr(F)$; while

(ii): there is no direct causal relation between E and F;

then E and F must be joint effects of a common cause. Reichenbach took this to mean
that there must be a third event C in the "common past" of E and F that "screens
them off", in the sense that their correlation disappears once we conditionalize on C:^8

$$pr(E \& F / C) = pr(E / C) \cdot pr(F / C) . \tag{2.6}$$

The idea of eq. 2.6 can be put in vivid epistemic terms, if we imagine an agent whose
initial credence is given by pr and who conditionalizes on what they learn. Namely, the
idea is: since by (ii) F does not causally influence E, knowledge that F occurs gives
no further information about whether E will occur, beyond what is already contained
in the knowledge that C occurred; (similarly, interchanging E and F). We saw this
same idea in our discussion leading to the SELD formulations, about SELD needing
to exclude (from the claim of stochastic irrelevance to E) events F so close to E that
their occurrence gives information about the prospects for E, beyond what is already
encoded in $pr_t$.
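Screening-off in the sense of eq. 2.6 can be checked on a small example. The sketch below assumes a hypothetical common-cause model in which E and F each depend only on C; the three probability tables are illustrative assumptions, not values from the text.

```python
from itertools import product

# Hypothetical toy model: C is a common cause; E and F each depend only on C.
P_C = {True: 0.3, False: 0.7}            # assumed pr(C), pr(not-C)
P_E_GIVEN_C = {True: 0.9, False: 0.2}    # assumed pr(E / C), pr(E / not-C)
P_F_GIVEN_C = {True: 0.8, False: 0.1}    # assumed pr(F / C), pr(F / not-C)

def joint(c, e, f):
    """pr(C=c & E=e & F=f) for the toy model."""
    pe = P_E_GIVEN_C[c] if e else 1 - P_E_GIVEN_C[c]
    pf = P_F_GIVEN_C[c] if f else 1 - P_F_GIVEN_C[c]
    return P_C[c] * pe * pf

def prob(pred):
    """Probability of the set of triples (c, e, f) satisfying pred."""
    return sum(joint(c, e, f)
               for c, e, f in product([True, False], repeat=3)
               if pred(c, e, f))

# Unconditionally, E and F are correlated: pr(E & F) differs from pr(E) * pr(F).
p_E = prob(lambda c, e, f: e)
p_F = prob(lambda c, e, f: f)
p_EF = prob(lambda c, e, f: e and f)
correlation = p_EF - p_E * p_F

# Conditional on C, the correlation disappears (eq. 2.6).
p_c = prob(lambda c, e, f: c)
p_EF_given_C = prob(lambda c, e, f: c and e and f) / p_c
p_E_given_C = prob(lambda c, e, f: c and e) / p_c
p_F_given_C = prob(lambda c, e, f: c and f) / p_c
screening_gap = abs(p_EF_given_C - p_E_given_C * p_F_given_C)

print(f"pr(E&F) - pr(E)pr(F) = {correlation:.4f}")
print(f"|pr(E&F/C) - pr(E/C)pr(F/C)| = {screening_gap:.2e}")
```

In this model the unconditional correlation is positive while the conditional one vanishes up to rounding, which is the pattern PCC demands of a common cause.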

The main similarities and differences between PCC and SELD are clear. Both state 
screening-off between a spacelike/causally unrelated pair of events. But where PCC 
says that some past event screens off, SELD is logically stronger. It is more specific 
about the probability function being time-dependent, and so about which events to 
exclude from the claim of stochastic irrelevance; (though this can be done in two 
ways, yielding SELD1 and SELD2). It also quantifies universally over earlier times
(hypersurfaces) t, and so over histories H. It also differs in two other ways: 

^7 Similarly, it would be worthwhile to compare our SEL with formulations of the same broad idea by
other authors: for example, Penrose and Percival's principle of conditional independence (unearthed
and endorsed by Uffink 1999) and Henson's (2005) screening-off conditions. Worthwhile: but we have
jobs enough for this paper.

^8 More precisely, Reichenbach required, in addition to eq. 2.6, that: (i) ¬C also screen off E
and F; and (ii) $pr(E/C) > pr(E)$ and $pr(F/C) > pr(F)$. From these assumptions, he proves that
$pr(E \& F) > pr(E) \cdot pr(F)$, thus making good the claim that C explains the E-F correlation. But
we can ignore (ii), and treat positive and negative correlations equally; and we can postpone (i) for a
while, viz. until Section 3.2.1, when we will take it in our stride in discussing common cause variables.
Rest assured: both these moves are endemic in the PCC literature; for example, they are made by
Uffink (1999), Henson (2005) and the Budapest school discussed in Section 3.2.



(a): It proposes as the screener-off the entire history up to t (or for SELD2: up to
t and within $C^-(E)$), not some individual event. We will return to this difference, in
Section 3.1.2 and later.

(b): It places its screener-off in a subscript (cf. eq. 2.2, eq. 2.3) labelling a
probability function, rather than behind a conditionalization stroke. Again, we will
return to this difference, especially in Sections 2.4.2, 3.1.1 and 3.2.3. For the moment,
note only that we can already see how this difference might be elided: cf. eq. 2.5.

About Bell: Bell's 'local causality' (2004, p. 54 (2)) is cast in his preferred
language of 'beables'. But once translated into Section 2.1's language of events and
histories, it is very close to SELD, especially SELD2. No surprise: Hellman (1982,
p. 466) and Butterfield (1994, p. 408) admit the inspiration. Thus local causality
says: let E and F be spacelike events in a stochastic theory, let N be the history in
the intersection $C^-(E) \cap C^-(F)$ of their past light cones, let F' be an event in the
remainder of $C^-(F)$, i.e. $C^-(F) - (C^-(E) \cap C^-(F))$: then^9

$$pr(E \& F' / N) = pr(E / N) \cdot pr(F' / N) . \tag{2.7}$$

This is just like SELD2, except that instead of history H within $C^-(t) \cap C^-(E)$ defining
a probability function $pr_H$ that screens off E and F (eq. 2.3), now history N within
$C^-(E) \cap C^-(F)$ does so, and does so by conditionalizing on N some given function
pr; (which is non-time-indexed, presumably because of implicitly assuming something
like eq. 2.5).

2.4 Implications between the formulations 

These three formulations, SELS, SELD1 and SELD2, are all natural expressions of the
idea of stochastic Einstein locality. But strictly speaking, they are inequivalent, since
they discuss events in "different" spacetime regions, and different probability functions.
However, I shall state some conditions under which they can be shown equivalent.
Section 2.4.1 concerns the conditions for SELD1 and SELD2 being equivalent; and
Section 2.4.2 discusses conditions for the equivalence of SELS and SELD2 (and thereby
SELD1).

2.4.1 Conditions for the equivalence of SELD1 and SELD2

I shall show that: (i) SELD1 implies SELD2; and (ii) under certain conditions, SELD2
implies SELD1. Both proofs will use the fact that by our allowance in (a) of Section
2.1, the hypersurfaces in SEL can include parts of the boundaries of past light-cones.

^9 In fact, Bell also conditionalizes on a specification of some beables in the remainder of $C^-(E)$;
but I set this aside, i.e. I take Bell's special case, the "empty/null specification". Incidentally, Bell
also emphasizes (2004, pp. 105-106) how countless everyday macroscopic events support the broad
common idea of PCC and local causality.
2.4.1.1 SELD1 implies SELD2 Proof: We are given a world w, an event E, a
hypersurface t earlier than E and dividing $C^-(E)$, and an event F in the difference
$C^-(t) - C^-(E)$, and a probability function $pr_{H,w}$ where H is w's history in the
intersection $C^-(t) \cap C^-(E)$.

We note that since $F \subset C^-(t)$, $C^-(F) \cap C^-(E) \subset C^-(t) \cap C^-(E)$. But this latter
is the region associated with the history H. So SELD1 applies, but now taking as
the hypersurface in SELD1, the boundary of $C^-(t) \cap C^-(E)$. Thus SELD1's eq. 2.2
becomes SELD2's eq. 2.3, or with the boundary $\partial H$ as label, eq. 2.4: as desired. QED.

To prove the converse, that SELD2 implies SELD1 (under certain conditions), we
first need a lemma which states a "concordance" between SELD1's and SELD2's
conditions on spacetime regions: i.e. on the spatiotemporal, especially causal, relations
between E, F and t. (So this concordance is not about their requirements on
probability functions.)

Concordance: For any event E, any hypersurface t earlier than E and
dividing $C^-(E)$, and any event F spacelike to E:

If $C^-(E) \cap C^-(F) \cap C^+(t) = \emptyset$, then there is a hypersurface t' that coincides
with t across all of $C^-(E)$ (and so also divides $C^-(E)$) and is such that
$F \subset C^-(t')$.

Proof: We use the fact that by (a) in Section 2.1, our hypersurfaces can include parts
of the boundaries of past light-cones. Thus we construct t' by defining it to be the
boundary of the region $[(C^-(E) \cap C^-(t)) \cup C^-(F)]$. (So this region is like a "table
mountain" $(C^-(E) \cap C^-(t))$ beside a "summit" $C^-(F)$. Note that because $C^-(E) \cap
C^-(F) \cap C^+(t) = \emptyset$, F cannot be so close to E that the "table-top" becomes a "mere
ledge" on the slope of the mountain $C^-(F)$.) Or alternatively we can define t' to consist
of the boundary of $C^-(t) \cup C^-(F)$: this will make t' coincide with t everywhere, except
at the "summit" of $C^-(F)$, where t' will be the surface of the summit. QED.

(Though it is not needed for what follows, we remark that one can prove the converse
of Concordance, by showing that if t and t' coincide in $C^-(E)$ and both divide $C^-(E)$,
then $C^+(t') \cap C^-(E) = C^+(t) \cap C^-(E)$. So if the intersection of $C^-(F)$ with the left
hand side is empty, i.e. $C^-(F) \cap C^+(t') \cap C^-(E) = \emptyset$, then so is the intersection of $C^-(F)$
with the right hand side: yielding the converse.)

2.4.1.2 SELD2 implies SELD1, given two assumptions It will be clearest to
introduce the assumptions we need, by embarking on a proof of the implication and
introducing them as the need for them becomes clear.

So we are given a world w, an event E, a hypersurface t earlier than E and dividing
$C^-(E)$, and an event F future to t, spacelike to E and such that $C^-(E) \cap C^-(F) \cap
C^+(t) = \emptyset$. We seek to show

$$pr_t(E \& F) = pr_t(E) \cdot pr_t(F) ; \tag{2.8}$$

where we have suppressed the world index from eq. 2.2.

First, we infer from Concordance that there is a hypersurface t' that coincides with
t across all of $C^-(E)$ and such that $F \subset C^-(t')$. So SELD2 applies, with H being the
world w's history in the intersection $C^-(t) \cap C^-(E) = C^-(t') \cap C^-(E)$. That is, we
have eq. 2.3, i.e. again suppressing the world-index:

$$pr_H(E \& F) = pr_H(E) \cdot pr_H(F) . \tag{2.9}$$

Or in terms of the surface $\partial H$, we have eq. 2.4: $pr_{\partial H}(E \& F) = pr_{\partial H}(E) \cdot pr_{\partial H}(F)$.

But to get eq. 2.8, we still need to evolve probabilities forward from the hypersurface
$\partial H$, i.e. the boundary of $C^-(t) \cap C^-(E) = C^-(t') \cap C^-(E)$, to t itself. That is: we
need to show that the stochastic independence in eq. 2.9 is not lost as one evolves
forward from $\partial H$ to t.

To show this, we need further assumptions. After all, events in $C^-(F) - C^-(E)$
can perfectly well influence F subluminally; and for all we have so far said, they might
spoil the stochastic independence in eq. 2.9.

There are no doubt various assumptions that would secure the implication. I think 
it most natural to make the following two assumptions. (For the argument to follow, 
we will no longer need to consider the hypersurface t' inferred from Concordance.) 
The first is entirely general: it is that probabilities evolve by conditionalization on the
intervening history, i.e. eq. 2.5.

The second assumption is specific to SEL. It is an equation of ratios of probabilities,
viz. eq. 2.17 below. But it will be clearest to introduce it when we see its role in the
proof. Suffice it to say initially that the equation combines two ideas about events E
and F, and a hypersurface t, given as in the hypothesis of SELD1:

(a): The events in $C^-(t) \cap (C^-(F) - C^-(E))$ contain exactly the same influences
on F (perhaps more neutrally: contain all the information about the prospects for F
happening) as are contained in the events in $C^-(t) - C^-(E)$. The latter region is larger
since for any sets X, Y, Z: $X \cap (Y - Z) \subseteq X - Z$. But the events in the difference, i.e.
in $C^-(t) - C^-(E)$ but not $C^-(t) \cap (C^-(F) - C^-(E))$, are spacelike to F. So this idea
reflects the prohibition of superluminal influences on F.

(b): Neither of the regions, $C^-(t) \cap (C^-(F) - C^-(E))$ and $C^-(t) - C^-(E)$, contains
influences on E. This reflects the prohibition of superluminal influences on E.

So now let us assume that probability functions evolve by conditionalization on
intervening history in the sense of eq. 2.5. Then if H* is w's history in the difference
$C^-(t) - C^-(E)$, and (as before) H is the history in the intersection $C^-(t) \cap C^-(E)$,
we have:

$$pr_{t,w}(\ ) = pr_{H,w}(\ / H^*) . \tag{2.10}$$

We combine this with the instance of SELD2 that takes F in SELD2 to be the whole
of the intervening history H* in the difference $C^-(t) - C^-(E)$. So eq. 2.3 (i.e. 2.9)
and 2.10 give

$$pr_{t,w}(E) = pr_{H,w}(E / H^*) = pr_{H,w}(E) . \tag{2.11}$$

Let us now write $H_E$, rather than our previous H, for the history in $C^-(t) \cap C^-(E)$.
Similarly let $H_F$ be the history in $C^-(t) \cap C^-(F)$. (We will not need the t label to
be explicit in such $H_E$, $H_F$; nor will we need the world-index.) Then eq. 2.11 can be
written as:

$$pr_t(E) = pr_{H_E}(E) . \tag{2.12}$$

An exactly similar argument for the event F given in the hypothesis of SELD1 (i.e.
future to t, spacelike to E, and such that $C^-(E) \cap C^-(F) \cap C^+(t) = \emptyset$) yields

$$pr_t(F) = pr_{H_F}(F) . \tag{2.13}$$

So what we seek, eq. 2.8, becomes

$$pr_t(E \& F) = pr_{H_E}(E) \cdot pr_{H_F}(F) . \tag{2.14}$$

And what we know, SELD2, eq. 2.3, is now

$$pr_{H_E}(E \& F) = pr_{H_E}(E) \cdot pr_{H_E}(F) . \tag{2.15}$$



Now we see that we can complete the proof by assuming an equation of ratios of
probabilities; as follows. Since $pr_{H_E}(E)$ is in both eq. 2.14 and 2.15, eq. 2.14 will
follow immediately if we assume:

$$\frac{pr_t(E \& F)}{pr_{H_F}(F)} = \frac{pr_{H_E}(E \& F)}{pr_{H_E}(F)} . \tag{2.16}$$

Equivalently, it suffices to assume:

$$\frac{pr_t(E \& F)}{pr_{H_E}(E \& F)} = \frac{pr_{H_F}(F)}{pr_{H_E}(F)} . \tag{2.17}$$

Eq. 2.17 is, I submit, a plausible assumption. Though it equates relative
probabilities (i.e. ratios of probabilities), the obvious justification for it lies in identifying
two bodies of information (i.e. sets of events), one associated with $C^-(t) - C^-(E)$ and
the other with $[(C^-(t) \cap C^-(F)) - (C^-(t) \cap C^-(E))] = [C^-(t) \cap (C^-(F) - C^-(E))]$.
For short, I will temporarily write this latter region as $[F - E]_t$. Note that this region
is "most" of $C^-(F) - C^-(E)$: think of how in a two-dimensional spacetime diagram,
$C^-(F) - C^-(E)$ forms a diagonal strip extending infinitely into the past; $[F - E]_t$ is
this strip apart from its "summit" future to t. The justification of eq. 2.17 now goes
as follows.

(i): The information about the prospects for E&F happening, that is associated
with $C^-(t) - C^-(E)$, is represented by the left hand side of eq. 2.17. (For $H_E$ is the
history in $C^-(t) \cap C^-(E)$.) If the left hand side is greater than 1, then that information
promotes E&F in the sense that the events in $C^-(t) - C^-(E)$ make E&F more likely.

(ii): Similarly: the information about the prospects for F happening, that is
associated with $[F - E]_t$, is represented by the right hand side of eq. 2.17. If the right
hand side is greater than 1, then that information promotes F in the sense that the
events in $[F - E]_t$ make F more likely.

Why should these quantitative measures of promotion (or inhibition) be the same?
Surely the most natural justification is that these two bodies of information (sets of
events) are the same. That is: the information about the prospects for E&F happening,
that is associated with $C^-(t) - C^-(E)$, is exactly the information about the prospects
for F happening, that is associated with $[F - E]_t$. As I mentioned in (a) and (b) just
above eq. 2.10, this identity reflects the prohibition of superluminal influences on F,
just as much as on E.

So to sum up the justification of eq. 2.17 (equivalently: 2.8 and 2.14): The
prohibition of superluminal influences makes these two bodies of information the same;
and if they are the same, the prospects for the respective events will match, giving eq.
2.17. For example, if this information is "positive", i.e. promotes the events E&F and
F respectively, then the ratio in eq. 2.17 will be greater than one. QED.



2.4.2 Conditions for the equivalence of SELS and SELD2 

I turn to stating some conditions under which SELS and SELD2 are equivalent.^10 (So
using Section 2.4.1, we could deduce conditions under which SELS and SELD1 are
equivalent.)

Suppose we assume that: 

(i) probabilities evolve by conditionalization, eq. 2.5;

(ii) all the worlds in W have the same initial probability function pr; this will mean
that we do not need a world-subscript w; (Lewis considers this: 1980, p. 112, 131).

It is then easy to show that SELS and SELD2 are equivalent, if we make the technically
simplifying pretence that the worlds of W have:

(a) only finitely many possible histories, $G_i$ say, for the region $C^-(t) - C^-(E)$; and

(b) only finitely many possible histories, $H_k$ say, for the region $C^-(t) \cap C^-(E)$.
(To generalize the proof that follows to realistic numbers of histories would require us
to apply conditional expectation to extremely large measure spaces, cf. footnote 6: but
I shall not attempt this.)

^10 What follows repeats my (1994, pp. 409-410); but is worth repeating, not just for the sake of
completeness but also for some points in Sections 3 and 4.

Assumptions (i) and (ii) imply that: in SELS, we can write $pr_{t,w}(E)$ as $pr(E / G_i \& H_k)$,
where w has histories $G_i$ and $H_k$ up to t; and in SELD2, we can write $pr_{H,w}(E)$ as
$pr(E / H_k)$. The proof then applies the elementary result that for any probability
function p, with $\{Y_i\}$ as a partition of its space and the $p(Y_i)$ non-zero, and for any event
X:

$$[\forall i,\ p(X / Y_i) = p(X)] \ \text{ iff } \ [\forall i \text{ and } j,\ p(X / Y_i) = p(X / Y_j)] . \tag{2.18}$$

To apply eq. 2.18, we note that assumptions (i) and (ii) mean that eq. 2.1 becomes

$$\forall k, i, i' : \ pr(E / G_i \& H_k) = pr(E / G_{i'} \& H_k) . \tag{2.19}$$

By eq. 2.18, this is so iff

$$\forall k, i : \ pr(E / G_i \& H_k) = pr(E / H_k) . \tag{2.20}$$
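The elementary result eq. 2.18 can be checked numerically on small partitions. The sketch below is only an illustration under assumptions: the function names, the three-cell partition and the conditional probabilities are all hypothetical choices made for the example.

```python
import math

def equals_marginal(p_Y, p_X_given_Y, tol=1e-12):
    """Left side of eq. 2.18: for all i, p(X / Y_i) = p(X)."""
    # p(X) by the law of total probability over the partition {Y_i}.
    p_X = sum(py * px for py, px in zip(p_Y, p_X_given_Y))
    return all(math.isclose(px, p_X, abs_tol=tol) for px in p_X_given_Y)

def equal_to_each_other(p_X_given_Y, tol=1e-12):
    """Right side of eq. 2.18: for all i and j, p(X / Y_i) = p(X / Y_j)."""
    first = p_X_given_Y[0]
    return all(math.isclose(px, first, abs_tol=tol) for px in p_X_given_Y)

# Hypothetical partition weights p(Y_i), all non-zero and summing to one.
p_Y = [0.2, 0.3, 0.5]

# Case 1: the conditional probabilities agree, so both sides hold.
uniform = [0.4, 0.4, 0.4]
# Case 2: the conditional probabilities differ, so both sides fail.
varied = [0.1, 0.4, 0.7]

results = {
    "uniform": (equals_marginal(p_Y, uniform), equal_to_each_other(uniform)),
    "varied": (equals_marginal(p_Y, varied), equal_to_each_other(varied)),
}
print(results)
```

In each case the two sides of the biconditional take the same truth value; in the application, the $Y_i$ play the role of the histories $G_i$ with $H_k$ held fixed.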



This gives eq. 2.3, i.e. SELD2, for the special case where F is a maximally strong
event, one of the $G_i$. For the general case, we treat F as the exclusive disjunction
of the various total histories $G_i$ that include it (in effect: include it as a conjunct),
and we sum over this limited range of $G_i$. Thus in any world with history H within
$C^-(t) \cap C^-(E)$:

$$\begin{aligned}
pr(E / F \& H) &= pr(E \& F / H) \,/\, pr(F / H) \\
&= \left[\sum_i pr(E \& G_i / H)\right] \,/\, \left[\sum_i pr(G_i / H)\right] \\
&= \left[\sum_i pr(E / G_i \& H) \cdot pr(G_i / H)\right] \,/\, \left[\sum_i pr(G_i / H)\right] \\
&= pr(E / H) \cdot \left[\sum_i pr(G_i / H)\right] \,/\, \left[\sum_i pr(G_i / H)\right] \\
&= pr(E / H) ;
\end{aligned} \tag{2.21}$$

where the last line applies the special case eq. 2.20 already obtained. For the converse
entailment, from SELD2 to SELS, we again take the special case of eq. 2.3 where F is
a $G_i$. That is, we take eq. 2.20. Then we apply eq. 2.18 to get eq. 2.19, i.e. SELS.

3 Relativistic causality in the Bell experiment

So much by way of introducing various formulations of SEL. In this Section and the
next, I will discuss how these formulations fare in quantum physics: first, in the Bell
experiment of elementary quantum mechanics (this Section); and then in quantum field
theory, in its algebraic formulation (AQFT: Section 4). For both quantum mechanics
and AQFT, the gist of the discussion will be that violation of outcome independence
(a central assumption in a proof of a Bell inequality) implies that one or another
formulation of SEL is violated.

In this Section, I begin by reviewing the Bell experiment, and some "ancient"
discussions by me and others of SEL and PCC (Section 3.1). In Section 3.2, I first
review the Budapest school's resuscitation of the PCC, based on the distinction between
every correlation having a common cause, and all correlations having the same such
cause. Then I endorse the recent arguments of Placek and the Bern school that, after
all, outcome dependence does impugn PCC. Finally in Section 3.3, I carry over these
arguments to SEL: after the long Section 3.2, that will be easy work. It will also be
clear that the discussion generates some open questions.

3.1 The background

Section 3.1.1 describes local models of the Bell experiment in terms which are both
common, and needed for the reviews of previous work: both my own in Section 3.1.2,
and others' in Section 3.2.
3.1.1 The Bell experiment reviewed

A stochastic local model of the spin (or polarization) version of the traditional
two-wing Bell experiment postulates a space Λ of complete states of the pair of particles.
We represent the two possible choices of measurement (of spin-component) on the left
(L) wing by $a_1, a_2$, and on the right (R) wing, by $b_1, b_2$. The idea is that $\lambda \in \Lambda$ encodes
all the factors that influence the measurement outcomes that are settled before the
particles enter the apparatuses, and that are therefore not causally or stochastically
dependent on the measurement choices: (we will later be more precise about this).
So a state ("hidden variable") λ specifies probabilities for outcomes ±1 of the various
single and joint measurements:



We also represent measurement outcomes by $A_i, B_i$, i = 1, 2, where $A_i = \pm 1$ is the
event that measuring $a_i$ has the outcome ±1. We will also use x as a variable over
$a_1, a_2$; X as a variable over $A_1, A_2$ and their negations (i.e., outcome -1); and for the
right wing, we similarly use y and Y.

Observable probabilities are predicted by averaging over λ. For example, the
observable left wing single probability for $A_1 = +1$ is:



(Some treatments include apparatus hidden variables, i.e. factors in each apparatus
influencing the outcome, so that $pr_{\lambda,a_i}$ is itself an average over hidden variables
$\lambda_L$ associated with the L-apparatus, using some distribution $\rho_L$ say: $pr_{\lambda,a_i}(X) =
\int pr_{\lambda,a_i,\lambda_L}(X)\, d\rho_L$. A Bell inequality is still derivable, given that the L and R
apparatus hidden variables are suitably independent. But I think my arguments would
be unaffected by this complication, and I set it aside.)

Note that eq. 3.1's subscript notation prompts one to think of joint measurements
in terms of four copies of the probability space that is the Cartesian product of Λ and
the tiny 4-element space $\{\langle +1,+1 \rangle, \ldots, \langle -1,-1 \rangle\}$; where each copy is labelled by
one of the four joint choices $\langle a_1, b_1 \rangle, \ldots, \langle a_2, b_2 \rangle$ of measurement, and the measure
on the probability space is fixed by ρ and the joint probabilities in eq. 3.1. Similarly,
of course for single measurements. In other words: we can think of joint measurements
in terms of four vector-valued random variables $\langle a_1, b_1 \rangle, \ldots, \langle a_2, b_2 \rangle$ on the space
Λ, all four random variables having $\{\langle +1,+1 \rangle, \ldots, \langle -1,-1 \rangle\}$ as codomain; (and
similarly for single measurements, with each of $a_1, a_2, b_1, b_2$ a random variable having
$\{+1,-1\}$ as codomain).

Since philosophers tend to think that not every event (or fact or proposition) has a
probability, this "many spaces approach" has the advantage that by not representing
the choices $a_1, \ldots, b_2$ as events in a probability space, they do not need to be assigned
a probability. But it has the disadvantage of making it harder to say that there is no
causal or stochastic dependence between the choices and other events or facts,
especially the value of λ. This disadvantage can be overcome by using a "big space", whose
elements are ordered quintuples $\langle \lambda, x, y, X, Y \rangle$; with the variables x, y, X, Y
understood as above. We can then say, for example, that each value of λ is stochastically
independent of each measurement choice:

$$pr(\lambda \& x) = pr(\lambda) \cdot pr(x) ; \qquad pr(\lambda \& y) = pr(\lambda) \cdot pr(y) . \tag{3.3}$$

The measure pr here can be defined straightforwardly from ρ, the probabilities given by
each λ, i.e. eq. 3.1, and postulated probabilities for the measurement choices x, y. For
the moment, I will continue with the many spaces approach, but in Sections 3.2.3 and
3.2.4 we will return to the big space approach. (For more discussion of the approaches,
cf. Butterfield (1989, p. 118; 1992, Sections 2, 3), Berkovitz (1998a, Section 2.1; 2002,
Section 3.3; 2007, Section 2).)

We now assume "locality" in two ways. First: the measure ρ, by which we average
over λ to get observable probabilities, is independent of the measurement choices: we
do not write $\rho_{x,y}$ or, for a single L measurement, $\rho_x$.^11

Second: the joint probabilities prescribed by each value of λ are assumed to factorize
into the corresponding single probabilities:

$$\forall \lambda;\ \forall x, y;\ \forall X, Y = \pm 1 : \quad pr_{\lambda,x,y}(X \& Y) = pr_{\lambda,x}(X) \cdot pr_{\lambda,y}(Y) . \tag{3.4}$$

Eq. 3.4 is called 'factorizability' or 'conditional stochastic independence'. We will of
course return to its relation to the idea of λ as a common cause of the outcomes, and
so to the PCC and SEL. For now, we note that it is the conjunction of two disparate
independence conditions; (for the proof, cf. e.g. Redhead (1987, pp. 99-100), Bub
(1997, p. 67)). The first is, roughly speaking, independence from the measurement
choice in the other wing; called 'parameter independence' (where 'parameter' means
'apparatus-setting'):

$$\forall \lambda;\ x, y;\ X, Y = \pm 1 : \quad pr_{\lambda,x}(X) = pr_{\lambda,x,y}(X) := pr_{\lambda,x,y}(X \& Y) + pr_{\lambda,x,y}(X \& \neg Y) ; \tag{3.5}$$

and similarly for R-probabilities. The second condition is, roughly, independence from
the outcome obtained in the other wing: 'outcome independence':

$$\forall \lambda;\ x, y;\ X, Y = \pm 1 : \quad pr_{\lambda,x,y}(X \& Y) = pr_{\lambda,x,y}(X) \cdot pr_{\lambda,x,y}(Y) . \tag{3.6}$$

^11 But we will later discuss how such dependence is correct, if λ represents, not just all causal factors
that are settled before the particles enter the apparatuses, but also the measurement choices. It can
be convenient to adopt this approach: cf. Section 3.2.4's discussion of Szabo (2000).

Bell's theorem states that any stochastic local model in this sense (especially,
obeying eq. 3.4, or equivalently, eq. 3.5 and 3.6) is committed to a Bell inequality
governing certain combinations of probabilities (e.g. Redhead (1987, pp. 98-101), Bub
(1997, pp. 56-57), Shimony (2004, Section 2)): which is experimentally violated. On
the other hand, quantum theory is not thus committed, and many experiments confirm
the quantum theoretic predictions.
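The contrast between factorizable models and quantum theory can be illustrated numerically with the CHSH form of the Bell inequality. The sketch below rests on assumptions not in the text: it uses the CHSH combination specifically, a finite space of hidden states with randomly generated single-wing probabilities, the singlet-state correlation $-\cos(a - b)$, and one standard choice of measurement angles. All function and variable names are hypothetical.

```python
import math
import random

SETTINGS = ("a1", "a2", "b1", "b2")

def random_local_model(n_states, rng):
    """A hypothetical factorizable model: finitely many hidden states with weights
    rho, and for each state a single-wing probability of outcome +1 per setting."""
    weights = [rng.random() for _ in range(n_states)]
    total = sum(weights)
    rho = [w / total for w in weights]
    p_plus = [{s: rng.random() for s in SETTINGS} for _ in range(n_states)]
    return rho, p_plus

def local_correlation(rho, p_plus, x, y):
    """Observable expectation of X*Y: average over the hidden states with the
    setting-independent measure rho, using factorizability (eq. 3.4)."""
    total = 0.0
    for weight, p in zip(rho, p_plus):
        mean_x = 2 * p[x] - 1   # expectation of X = +/-1 at this hidden state
        mean_y = 2 * p[y] - 1
        total += weight * mean_x * mean_y
    return total

def chsh(corr):
    """CHSH combination E(a1,b1) + E(a1,b2) + E(a2,b1) - E(a2,b2)."""
    return corr("a1", "b1") + corr("a1", "b2") + corr("a2", "b1") - corr("a2", "b2")

rng = random.Random(0)
local_values = []
for _ in range(1000):
    rho, p_plus = random_local_model(5, rng)
    local_values.append(abs(chsh(lambda x, y: local_correlation(rho, p_plus, x, y))))
max_local = max(local_values)

# Quantum singlet state, with an assumed standard choice of coplanar angles.
ANGLES = {"a1": 0.0, "a2": math.pi / 2, "b1": math.pi / 4, "b2": -math.pi / 4}
quantum = abs(chsh(lambda x, y: -math.cos(ANGLES[x] - ANGLES[y])))

print(f"largest |CHSH| over sampled local models: {max_local:.3f}")
print(f"|CHSH| for the singlet state: {quantum:.3f}")
```

Every sampled factorizable model stays within the bound 2, while the singlet value is $2\sqrt{2}$. Random sampling only illustrates the bound; it is no substitute for the proofs cited above.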

So which assumption of the derivation of a Bell inequality, or of its experimental 
testing, is the culprit? The usual verdict is: outcome independence; and in this paper 
I will endorse this verdict, so as to pursue the consequences for PCC and SEL. Indeed, 
this verdict is suggested by quantum theory's obeying parameter independence but not 
outcome independence. More precisely: putting a quantum mechanical state for each
λ, and taking the probabilities at a given λ, eq. 3.1, to be given by the orthodox Born
rule, we infer that:

(i): eq. 3.5 holds: it is now a statement of the quantum no-signalling theorem,
following from the commutation of the L- and R-quantities; but

(ii): eq. 3.6 fails, except in some special cases such as the model's quantum state
being a product state.

Besides, quantum theoretic descriptions of a much less idealized Bell experiment retain 
this contrast between parameter independence and outcome independence: that the 
first holds, and the second fails. 
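For a concrete instance of this contrast, one can take the singlet state as the quantum state and compute the Born-rule probabilities directly. The sketch below assumes the standard singlet formula for spin measurements along coplanar directions, $pr(X \& Y) = (1 - XY\cos(a - b))/4$; the function names and the particular angles are hypothetical choices for illustration.

```python
import math

OUTCOMES = (+1, -1)

def singlet_joint(X, Y, a, b):
    """Born-rule probability of outcomes X, Y = +/-1 for spin measurements along
    coplanar directions at angles a, b (radians) on the singlet state."""
    return (1 - X * Y * math.cos(a - b)) / 4

def left_marginal(X, a, b):
    """pr(X) on the left wing, obtained by summing over right-wing outcomes."""
    return sum(singlet_joint(X, Y, a, b) for Y in OUTCOMES)

def right_marginal(Y, a, b):
    """pr(Y) on the right wing, obtained by summing over left-wing outcomes."""
    return sum(singlet_joint(X, Y, a, b) for X in OUTCOMES)

a = 0.0
right_settings = [0.0, math.pi / 4, math.pi / 3, math.pi / 2]

# Parameter independence (eq. 3.5): the left marginal does not vary with b.
left_marginals = [left_marginal(+1, a, b) for b in right_settings]

# Outcome independence (eq. 3.6): joint minus product of marginals, per setting b.
oi_gaps = [
    singlet_joint(+1, +1, a, b) - left_marginal(+1, a, b) * right_marginal(+1, a, b)
    for b in right_settings
]

print("left marginals:", left_marginals)
print("outcome-independence gaps:", oi_gaps)
```

The left marginal is 1/2 for every right-wing setting, while the gap between the joint probability and the product of marginals equals $-\cos(a - b)/4$, which vanishes only when the two directions are perpendicular.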

But I emphasise that this usual verdict is a matter of judgment; for two reasons. 
First, rival interpretations of quantum theory (especially: approaches to the mea- 
surement problem) can motivate different verdicts. The best-known example is the 
pilot-wave approach. Here it is natural to take A to include both the quantum state, 
and the "hidden variables", e.g. point-particle positions in elementary quantum me- 
chanics: in which case, parameter independence fails and outcome independence holds. 
The quantum no-signalling theorem is nevertheless recovered at the level of observable 
probabilities by averaging over the hidden variables using the "quantum equilibrium" 
distribution (i.e. the Born rule). Other examples include dynamical reduction theories 
and many-worlds interpretations: even though in some cases of these examples, the
one-liner verdict is the same as above (i.e. outcome independence fails), the
interpretation of this failure can be very different. (Cf. e.g.: Butterfield et al. (1993) and Kent
(2005) for dynamical reduction theories; and Bacciagaluppi (2002, Sections 4, 6) and
Timpson and Brown (2002, Section 4) for many-worlds interpretations.) And these
different interpretations can have the merit of revealing others' implicit assumptions.
No doubt, the main case of this is how many-worlds interpretations reveal the implicit
assumption I have made, that measurement outcomes are genuine, definite events.^12

Second, people can of course differ about which assumptions of Bell's theorem 
are false, without having to motivate their opinion by anything so ambitious as an 
interpretation of quantum theory. The two best-known ways to save factorizability, eq.
3.4, are usually called the 'detector efficiency' and 'locality' loopholes. I will ignore the
former, which is usually considered almost closed, and concentrate on the latter.^13

^12 But some philosophers have been prompted by desiderata in the philosophy of causation, rather
than by many-worlds interpretations, to articulate, and even argue for, this assumption; e.g.
Butterfield (1994, Section 4), Grasshoff et al. (2005, p. 668).

^13 For a masterly discussion, cf. Shimony (2004): his Sections 3 to 5 consider loopholes. Santos
(2005, especially pp. 555-561) is a heterodox but admirably detailed discussion. I also set aside other
The locality loophole is the idea that λ is not (causally or stochastically) independent
of (one or both of) the measurement choices, either because of causation between
them or because both are influenced by a common cause. But most people regard the
latter as an incredible conspiracy, and the former as ruled out by experiments in which
the measurement choices are spacelike to the emission of the particles from the source.
I will concur with this (despite footnote 13).^14

Note that in such an experiment each choice-event is also spacelike to a finite, 
albeit perhaps small, region comprising the "summit" of the past light-cone of the 
emission event. On account of these experiments it has become common, and it seems 
reasonable, for philosophical discussions of the fate of PCC in the Bell experiment to 
assume that the choice-events occur after any common cause event: we will return to
this, especially in Section 3.2.5.

3.1.2 My previous position 

As I mentioned in Section 1, there are two motivations for exhibiting the violation of
SEL in the Bell experiment. First: our experience of the world, as codified in classical 
and relativistic physics, suggests that — in a slogan — causal processes cannot travel 
faster than light; and I submit that SEL is a plausible expression of that idea. So if 
we want to express precisely in which sense quantum theory is a "non-local" theory, 
or what is "spooky" about Bell correlations, the violation of SEL is a good candidate. 
In other words: it is worth exhibiting the violation of SEL, as a way of locating the 
mysteriousness of Bell correlations. 

The second motivation is personal. In previous papers, I argued that outcome de- 
pendence implied that some formulations of SEL were violated in quantum mechanics. 
(This seemed worth showing since some previous work suggested the opposite.) I also 
endorsed the literature's folklore that outcome dependence implied the violation of the 
much more familiar condition, PCC (cf. Section 2.3.2). Of course, these positions went
hand in hand, in view of the similarities between PCC and the SELD formulations. To 
set the stage for later discussion, I need to recall these views: first about SEL, then
about PCC. (A later Section will report my previous views about the fate of SEL in
quantum field theory, and defend them!)

loopholes, i.e. deniable implicit assumptions of Bell's theorem. Some are familiar, e.g. allowing some
kind of backwards causation (Cramer 1986; Price 1996, Chapters 8,9; Berkovitz 2002, Section 5);
others have only recently been articulated, such as the memory loophole and the collapse locality
loophole (Kent 2005, and references therein). But I stress that these loopholes make it hard to secure
a conclusive contradiction between quantum theory and "local realism"; so that my blaming the Bell
inequality's violation on outcome dependence, and in the sequel on the falsity of PCC and SEL,
involves a judgment. Notoriously for philosophers, they also make it hard to infer spacelike causation
from the Bell correlations (cf. Berkovitz 1998, 1998a, 2007 for detailed reviews; and Suarez 2007); but
I will not discuss causation, except en passant in connection with the PCC. I also set aside the conflict
between quantum theory and non-local realism, i.e. the derivability of Bell inequalities from certain
non-local stochastic hidden variable theories: cf. Fahmi and Golshani (2006), Socolovsky (2003) and
Seevinck (forthcoming).

^^The first such experiment was by Aspect et al. (1982); cf. also Weihs et al. (1998). Of course,
'choice' here need not be a human decision: which quantity is measured might be determined by a
random device (for some details, cf. Shimony (2004, Section 5)).

In my (1994), I gave three formulations of SEL. The first, called SEL1, was essentially
our (Section 2.3's) SELS. The second, called SEL2, was our SELD2. (The third used
counterfactuals, and here I set it aside; our SELD1 was not discussed in (1994).)
Then I argued that the Bell experiment in elementary quantum mechanics violated
SEL (i.e. all formulations). (In this I disagreed with Hellman (1982), who hedged
his corresponding formulations of SEL with provisos so as to secure that quantum
mechanics, despite outcome dependence, obeyed them: details in Section 3.3.0.2.)

In my (1989, 1992), I also joined other authors in holding that outcome dependence 
violated PCC. More precisely, I argued that what I took to be a weaker cousin of 
PCC was violated. For PCC, in Section 2.3.2's form, faces counterexamples that have
nothing to do with the Bell inequality: everyday life and classical physics provide
examples in which events A, B and their common cause C (or the event that best
deserves that name) do not satisfy eq. 2.6.

I argued that we should reply to these examples by taking as the 'common cause'
of E and F the total physical state of a spacetime region that deserves (well enough,
if not best) the name 'common past of E and F'. In particular, it need not be the
intersection of their past light-cones: in some cases, it needs to include parts of the
boundaries of these light-cones that are future to the "summit" of $C^-(E) \cap C^-(F)$.
(But it is in all cases a subset of the union $C^-(E) \cup C^-(F)$.) Taking C as this total
physical state, I dubbed eq. 2.6 'PPSI' (for 'Past Prescribes Stochastic Independence').
(Here, 'total physical state' needs of course to be intrinsic; cf. Section |5TT]) 

More precisely, I took this total physical state C to prescribe a probability
distribution $pr_C$, without itself being conditioned on (Section 3.1.1's many spaces approach).
So PPSI said (1989, pp. 123-124):

PPSI: If spacelike events E and F are correlated but one does not cause
the other, then:

\[ pr_C(E \& F) = pr_C(E) \cdot pr_C(F) \tag{3.7} \]

where $pr_C$ is the probability distribution prescribed by the total physical
state C of the common past of E and F.

I argued that by taking the common past as large as possible, and taking its total 
physical state, PPSI was weak, since one was in effect conditioning on more events — 
all those in the common past, both those that causally affect E and F and those that 
do not. 

Then I argued that when we take E and F to be the measurement outcomes of a
Bell experiment, the common past region could be chosen so as to satisfy three
desiderata for proving Bell's theorem:

(i): it is large, so that PPSI is weak (and plausible since satisfied by countless
examples in everyday life and classical physics);

(ii): PPSI justifies the outcome independence and parameter independence
assumptions of Bell's theorem;

(iii): the other assumptions of Bell's theorem, especially $\lambda$'s independence of
measurement choices, are plausible (contra the loopholes listed at the end of Section 3.1.1).

So my overall conclusion was that (as in my papers' discussion of SEL) the
mysteriousness of the Bell correlations can be taken to lie in the violation of PPSI applied to
measurement outcomes.

Two final comments about PPSI, which will be important in Sections 3.2 and 3.3:

(1) : In taking PPSI to be plausible (especially, as a reply to the everyday
counterexamples to PCC) I assume that 'conditionalizing on all the other [i.e. causally
irrelevant] events in the [common past] does not disrupt the stochastic independence
induced by conditionalizing on the affecting events' (1989, p. 123). That
conditionalizing can produce stochastic dependence is sometimes called 'Simpson's paradox'; so
my position assumes that Simpson's paradox does not apply. We shall return to this
assumption, especially in Section 3.2.5.
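
A toy illustration may help to fix the point that conditionalizing can produce stochastic dependence. The sketch below is not drawn from the papers cited: the three-bit probability space, the events E, F, G and the helper `pr` are all assumptions made purely for illustration. E and F are independent fair coins, and G is the event that they disagree; E and F are stochastically independent unconditionally, but become dependent once one conditionalizes on G.

```python
from fractions import Fraction
from itertools import product

# Toy probability space (an assumption for illustration only):
# atoms are triples (e, f, g) of 0/1 values, where e and f are
# independent fair coins and g = 1 exactly when e and f disagree.
atoms = {}
for e, f in product((0, 1), repeat=2):
    atoms[(e, f, e ^ f)] = Fraction(1, 4)

def pr(event, given=lambda a: True):
    """Probability of `event` conditional on `given` (both are predicates on atoms)."""
    norm = sum(p for a, p in atoms.items() if given(a))
    return sum(p for a, p in atoms.items() if given(a) and event(a)) / norm

E = lambda a: a[0] == 1
F = lambda a: a[1] == 1
G = lambda a: a[2] == 1
E_and_F = lambda a: E(a) and F(a)

# Zero gap means stochastic independence; non-zero gap means dependence.
unconditional_gap = pr(E_and_F) - pr(E) * pr(F)
conditional_gap = pr(E_and_F, G) - pr(E, G) * pr(F, G)
```

Here `unconditional_gap` vanishes while `conditional_gap` does not. Whether such a G could count as a 'causally irrelevant' event in a common past is of course exactly what the assumption in (1) denies; the sketch only shows that the arithmetic permits it.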

(2) : PPSI is like Section 2.3's formulations of SEL (and Bell's 'local causality') in
having the total state of a spacetime region prescribe a probability distribution, rather
than be an event that is conditionalized on (as in Reichenbach's PCC, eq. 2.6). In
Section 3.1.1's terms: PPSI adopts the many spaces approach. This has two important
consequences, which will be developed below.

(2a): The similarity between PPSI and SEL makes it easier to argue, from
the Bell experiment's violation of PPSI, that it also violates SEL (Section 3.3).

(2b): Conditionalizing on an event as in eq. 2.6 is easily generalized to
considering a partition of the probability space, and conditionalizing on each of its cells. And
if one is considering several pairs of correlated events, one can ask whether (i) there
is a single partition (each cell of) which yields a screening-off, or (ii) there are several
partitions, one for each pair of correlated events. This will be the theme of Section
3.2. But this contrast can hardly be expressed if, as in PPSI and SEL, we adopt the
many spaces approach and "go all the way down" to an individual world, or history
up to a hypersurface, or total physical state of a common past, which we then take to
prescribe a probability distribution. We will return to this feature of PPSI and SEL
in Sections 3.2.3.2 and 3.3.

3.2 A common common cause? The Budapest school 

So much by way of recalling my previous views. But these now need to be re-examined 
in the light of the subsequent literature. The first thing to say is that happily, some 
later authors endorse them. For example, Uffink and Henson join me in following 
the lead of Bell's local causality, but allowing the common past to include more than 
the intersection of the past light-cones. So they formulate a condition very like my 

^^Cf. my (1989); especially pp. 122-130 for PPSI in general, and pp. 135-144 for the application to 
the Bell experiment, i.e. desiderata (i)-(iii). My (1992, pp. 74-76) is a summary. 
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PPSI, and then say that outcome dependence violates it; (Uffink 1999, pp. S523-S524; 
Henson 2005, pp. 516-533). 

But as I said in Section 1, the main development I need to address is the Budapest
school's work objecting to the folklore that outcome dependence violates PCC, on the
grounds that PCC does not require different correlations to have the same common
cause (a 'common common cause'). I will discuss this work (Sections 3.2.1 to 3.2.4),
but then maintain, following "replies" by Placek and the Bern school, that the folklore
is right: outcome dependence violates PCC (Sections 3.2.5 and 3.2.6). (So much of
Sections 3.2.1 to 3.2.6 is expository.) Then in Section 3.3 I will carry the discussion
over to PPSI and SEL.

3.2.1 Resuscitating the PCC 

The Budapest school provides two kinds of resuscitation of PCC: formal, and physical. 
The formal resuscitation consists of rigorous theorems that under certain conditions, 
common cause events — technically: events in a probability space that screen off — must 
exist. There are such theorems, both for classical and for quantum probability spaces; 
and also in the framework of AQFT. But here I only need a rough statement of one 
main theorem for classical probability. The idea is to proceed in two steps. 

(1) : Given a probability space, say S, and a pair of correlated events E, F (so
E, F $\in$ S), S may well not contain a common cause in Reichenbach's sense. But one
can build another probability space $S'$, with the features: (i) S can be mapped
one-to-one into $S'$, while preserving both the algebraic (set-theoretic) relations between, and
the probabilities of, events; call this map h; (ii) $S'$ contains a common cause of h(E)
and h(F). In fact, $S'$ is built from the disjoint union of two copies of the given space
S.

(2) : Given any finite set of pairs of correlated events in a probability space, one can
pick one pair after another, iterating the construction in step (1); and so conclude by
induction that for any such finite set of pairs of correlated events, there is a probability
space which for each pair contains a Reichenbachian common cause.

^^For my purposes in this Section, the main references for the Budapest school are: Hofer-Szabo
et al. (1999, 2002), Szabo (2000), Redei (2002) and Hofer-Szabo (2007); and the main references for
the Bern school are: Grasshoff et al. (2005) and Portmann and Wüthrich (2007). Placek is of course
a leader of the (equally prolific!) "Pittsburgh-Krakow school": they develop a rigorous framework
combining modality (indeterminism, "branching spacetime"), events' spacetime locations, and their
probabilities; and then use this framework to analyse the Bell experiment. This framework is fruitful:
for example, its branching structure secures an algebraic (Galois connection) definition of the different
outcomes of a stochastic event, and a theorem that they form a Boolean algebra (Kowalski and Placek
1999, Section 2). It also provides rigorous formulations of: (i) two incompatible measurements or
outcomes occurring in the same spacetime region; (ii) the PCC in both weak and strong versions;
and (iii) Bell theorems. Important recent references include: Placek 2002, 2004, Belnap 2005, Müller
2005. Besides, SEL could surely be formulated in it (even more rigorously than in Section 2.3.1), so
as to assess whether SEL is violated by outcome dependence. But I will duck out of this project, since
the framework is so rigorous as to make it a considerable effort: and I wager that it would make no
difference to my conclusions. In particular, we will see that Placek's defence of a common common
cause can be separated from the framework.



The physical resuscitation is based on distinguishing, for situations with more than 
one pair of correlated events, a weak PCC and a strong one. The weak PCC requires 
only that for each pair of correlated events, there is a screener-off, with different pairs 
in general having different screeners-off. This weak version seems to have been
Reichenbach's main idea. In any case, it is certainly this weak version that is vindicated
by the theorem just reported: for n such pairs, each of step (2)'s n iterations of (1)'s
construction will define a new event (a subset of the new disjoint union) as the common
cause of the pair being addressed at that iteration. On the other hand, the strong PCC
requires that all pairs have the same screener-off: a common common cause. So the
strong PCC is not vindicated by the above theorem.

Let us first make this distinction more precise, while also generalizing to allow the
common cause to be, not a dichotomic event that either occurs or not (C or $\neg C$),
but (as needed for the Bell experiment) the value of a variable, with maybe more
than two values. On analogy with eq. 3.4 et seq., and Reichenbach's own requirement
that both the common cause C and its negation $\neg C$ screen off E and F, we shall
require stochastic independence at each value of the variable, $\lambda$ say, in its range $\Lambda$.
So in discussing common causes for a general probability space S, $\Lambda$ corresponds to a
partition of S, and a value $\lambda \in \Lambda$ to a cell. Of course, an example of such a space S
appropriate to the Bell experiment is the "big space" whose elements are quintuples
$\langle \lambda, x, y, X, Y \rangle$, mentioned in Section 3.1.1.

^^For details of (1) and (2), cf. Proposition 2 of Hofer-Szabo et al. (1999). The Pittsburgh-Krakow
school has a theorem with a similar flavour; cf. Placek (2000, pp. 456-459; 2000a, pp. 176-178; 2002,
pp. 333-334). The Budapest school also proves a restrictive necessary condition on the probabilities
of two pairs of correlated events, for them to have a common common cause, even in an extension
$S'$ of the given space S (Hofer-Szabo et al. (2002), Proposition 4). This restriction prompts the next
paragraph's discussion of the strong PCC.

^^Nor could it be, in view of the necessary condition mentioned in the previous footnote. Nor,
incidentally, is the strong PCC vindicated by the theorem's quantum analogue; to check this, cf. eq.
(36) et seq. of Proposition 3 of Hofer-Szabo et al. (1999). But I shall not need to consider the
Budapest school's quantum version of the PCC, in either weak or strong versions: which it would be
worthwhile to compare with Henson's version (2005, pp. 534-536). Nor, incidentally, need I consider
Uffink's (convincing!) proposal for how to extend the PCC to multiple events (1999, pp. S517-S520).

^^On the other hand, we can keep things simple by taking the correlated events themselves always
to be dichotomic (essentially because the quantities at issue in the Bell experiment are two-valued).
A note on jargon: the Budapest school calls cases where the common cause is a cell of a partition
(value of a variable) rather than a single event 'common cause systems'; and so a strong PCC requires
a 'common common cause system'. Cf. Hofer-Szabo (2007, Sections 1 and 2) for a review of the
definitions, and references.

Then a weak PCC, and a strong PCC, can be stated as follows. Let $\{E_m\}$, $\{F_m\}$ be
two sets of events in a probability space S. We say the set of (ordered) pairs $(E_m, F_m)$
is correlated iff for some (maybe all) values of m, $pr(E_m \& F_m) \neq pr(E_m) \cdot pr(F_m)$. Then
the weak PCC states: for each value of m for which there is a correlation, there is a
partition, $\Lambda^m$ say, of the probability space S (i.e. there is a common cause variable)
such that for every value (cell) $\lambda^m_l$ of $\Lambda^m$:

\[ pr(E_m \& F_m / \lambda^m_l) = pr(E_m / \lambda^m_l) \cdot pr(F_m / \lambda^m_l) \tag{3.8} \]

(We assume that the relevant conditional probabilities are non-zero; cf. footnote 6.
The range of l, i.e. the number of cells in the partition, may of course depend on m.)
Again: the stochastic independence at each value (cell) generalizes Reichenbach's
requiring screening-off both by C and by $\neg C$.

On the other hand, the strong PCC reverses the quantifiers to give a stronger 'there
exists, for all' statement: viz., there is a partition (a common cause variable), $\Lambda$, of
the probability space S such that, for each value of m for which the pair $(E_m, F_m)$ is
correlated, and for every value $\lambda$ of $\Lambda$:

\[ pr(E_m \& F_m / \lambda) = pr(E_m / \lambda) \cdot pr(F_m / \lambda) \tag{3.9} \]

Clearly, this single partition $\Lambda$ of S deserves the name 'common common cause' (or
'common common cause variable').
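
The difference in quantifier order between the weak and the strong PCC can be sketched for a finite probability space. What follows is only a toy, under stated assumptions: the 16-atom space, the events, the candidate partitions, and the function names (`screens_off`, `weak_pcc`, `strong_pcc`) are all hypothetical, and the search is restricted to a supplied list of candidate partitions (since in a finite space the partition into singletons screens off everything trivially).

```python
from fractions import Fraction
from itertools import product

def pr(space, event):
    """Probability of an event (a set of atoms) in a finite space {atom: prob}."""
    return sum(p for atom, p in space.items() if atom in event)

def screens_off(space, cell, E, F):
    """True if pr(E & F / cell) = pr(E / cell) * pr(F / cell), written without division."""
    pc = pr(space, cell)
    if pc == 0:
        return True  # convention of this sketch: ignore null cells
    return pr(space, E & F & cell) * pc == pr(space, E & cell) * pr(space, F & cell)

def is_common_cause_variable(space, partition, E, F):
    """Every cell of the partition screens off E from F."""
    return all(screens_off(space, cell, E, F) for cell in partition)

def weak_pcc(space, candidates, pairs):
    """For each pair there is SOME candidate partition (maybe a different one each time)."""
    return all(any(is_common_cause_variable(space, P, E, F) for P in candidates)
               for E, F in pairs)

def strong_pcc(space, candidates, pairs):
    """There is ONE candidate partition that works for every pair."""
    return any(all(is_common_cause_variable(space, P, E, F) for E, F in pairs)
               for P in candidates)

# Hypothetical example: atoms (u, v, s, t), four independent fair bits.
atoms = list(product((0, 1), repeat=4))
space = {a: Fraction(1, 16) for a in atoms}

def where(pred):
    """The event consisting of all atoms satisfying the predicate."""
    return frozenset(a for a in atoms if pred(*a))

# Pair 1 is correlated through u, pair 2 through v.
E1, F1 = where(lambda u, v, s, t: u and s), where(lambda u, v, s, t: u and t)
E2, F2 = where(lambda u, v, s, t: v and s), where(lambda u, v, s, t: v and t)
pairs = [(E1, F1), (E2, F2)]

by_u = [where(lambda u, v, s, t, k=k: u == k) for k in (0, 1)]
by_v = [where(lambda u, v, s, t, k=k: v == k) for k in (0, 1)]
by_uv = [where(lambda u, v, s, t, j=j, k=k: (u, v) == (j, k))
         for j in (0, 1) for k in (0, 1)]
```

With the candidates `[by_u, by_v]`, `weak_pcc` holds but `strong_pcc` fails: each pair has its own screener-off, but neither partition serves both pairs. Adding the finer partition `by_uv` to the candidates makes `strong_pcc` hold as well. This toy is of course classical, so it cannot reproduce the Bell case, where the issue is whether any physically acceptable common common cause exists at all.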

Since the Bell experiment involves several pairs of correlated events, this
distinction applies to it. (For most quantum states and measurement choices: all four joint
measurement choices give a correlation, so that since all four quantities are two-valued,
there are four pairs of correlated events.) The Budapest school now claims that proofs
of a Bell inequality seem always to use a strong version of PCC; i.e. to invoke a common
common cause, along the lines of eq. 3.9. If so, then the violation of the Bell
inequality impugns at worst the strong PCC: not the weak one, nor therefore Reichenbach's
original formulation.

This last claim prompts five discussions, taken up in the following Subsubsections. 
The first and third support the claim; the second clarifies it by stating two distinctions 
about how to think of the space $\Lambda$. Then the fourth and fifth reply to the claim, in two
ways. These replies follow Placek and the Bern school, respectively (adding to their
discussions only the deployment of the two distinctions). Then with these discussions
about PCC in hand, Section 3.3 will be able to make quick work of SEL.

3.2.2 Known proofs of a Bell inequality need a strong PCC 

First, the claim is plausible, when we consider how known proofs of a Bell inequality
proceed: compare eq. 3.9 with eq. 3.4. More generally, we notice that:

(a) : Bell inequalities are linear inequalities involving joint probabilities (and in 
some versions, also: single probabilities) of outcomes. 

(b) : A hidden variable model of the experiment represents each of these probabilities
as a weighted average (convex combination) of "hidden" conditional probabilities for
the joint (or single) outcome, viz. the probabilities conditional on each value of $\lambda$.

^"This point was first made by Belnap and Szabo (1996); cf. also e.g. Hofer-Szabo, G. et al. (1999), 
p. 388. 



(c): In known derivations of the Bell inequality, it is crucial that the same variable
$\lambda$, and so the same weights $pr(\lambda)$, be used in each term of the inequality; and that the
model is "local" in that this $\lambda$ provides a strong PCC along the lines of eq. 3.9. For
it is only with these assumptions that we can argue: first, that the joint (and maybe
single) probabilities conditional on each value of $\lambda$ obey an inequality (as a matter of
arithmetic); and then that averaging over the values of $\lambda$ (and so going from hidden
conditional probabilities to observable probabilities) preserves the inequality.
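
The two-stage argument in (a) to (c) can be checked numerically in a toy case. The sketch below makes several assumptions that are not in the text: it uses the CHSH combination of four correlations as the linear inequality, it generates a random factorizable model with a finite number of values of the hidden variable, and all function names are hypothetical. At each hidden value the CHSH quantity is bounded by 2 as a matter of arithmetic; averaging with the same weights in every term preserves the bound.

```python
import random

def chsh(corr):
    """CHSH combination of four correlations, keyed by (i, j) with i, j in {1, 2}."""
    return corr[1, 1] + corr[1, 2] + corr[2, 1] - corr[2, 2]

def random_local_model(n_values, rng):
    """Hypothetical factorizable model: weights pr(lam) and, for each lam,
    single-wing expectation values in [-1, 1] for the settings a1, a2, b1, b2."""
    raw = [rng.random() for _ in range(n_values)]
    total = sum(raw)
    weights = [r / total for r in raw]
    wings = [{k: rng.uniform(-1, 1) for k in ("a1", "a2", "b1", "b2")}
             for _ in range(n_values)]
    return weights, wings

def hidden_corr(w):
    """Factorizability at a fixed lam: the joint expectation is a product."""
    return {(i, j): w[f"a{i}"] * w[f"b{j}"] for i in (1, 2) for j in (1, 2)}

def observable_corr(weights, wings):
    """Average the hidden correlations, using the SAME weights in every term."""
    return {(i, j): sum(p * hidden_corr(w)[i, j] for p, w in zip(weights, wings))
            for i in (1, 2) for j in (1, 2)}

rng = random.Random(0)
weights, wings = random_local_model(50, rng)
hidden_values = [chsh(hidden_corr(w)) for w in wings]      # one CHSH value per lam
observed_value = chsh(observable_corr(weights, wings))     # CHSH of the averages
```

Because the CHSH expression is linear, `observed_value` is just the weighted average of `hidden_values`, and so inherits their bound. If instead each of the four terms were averaged with its own weights over its own variable, this last step would have no justification; that is the point on which the Budapest school's claim turns.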

One can also support the claim, other than by surveying proofs. Namely, one can 
try to build a hidden variable model of the Bell experiment that is empirically adequate 
(i.e. delivers the correct quantum correlations) and uses only the weak PCC, and is 
otherwise physically reasonable. Here one naturally takes 'physically reasonable' to 
mean analogues of Section 3.1.1's conditions. Thus one seeks a model in which for
each postulated common cause variable, $\Lambda^m$:

(i) appropriate versions of both parameter independence and outcome independence
hold good; and

(ii) $\Lambda^m$ is not correlated with the choice of measurement.

In Section 3.2.4, I will describe how Szabo (2000) builds such a model (he adapts
the disjoint union construction reported in (1) at the start of Section 3.2.1). But
the exact formulation of conditions (i) and (ii) needs some care. For the formulation
depends on:

(Many-Big): the distinction mentioned in Section 3.1.1 between the many spaces
and the big space approach to formulating Bell's theorem;

(Only-Total): the distinction mentioned in Section 3.1.2 about whether or not $\lambda$
encodes only factors that are causally relevant to the outcomes.

These distinctions will also play a role in our eventual reply to the Budapest school's
claim. So I turn to filling out these distinctions, (Many-Big) and (Only-Total).

3.2.3 Two distinctions 

3.2.3.1 The distinctions stated

As to (Many-Big), we saw in Section 3.1.1 that:

(i) : The many spaces approach postulates in the first instance a space $\Lambda$ of
hidden variables; one may then build other spaces from that, e.g. by taking a Cartesian
product. On the other hand:

(ii) : The big space approach takes a value $\lambda$ to be an event (i.e. a subset, not an
individual element) in a larger space that also has measurement choices and outcomes
as events.

Then in Section 3.1.2 we stated (implicitly) another distinction, (Only-Total), which
cuts across (Many-Big). Namely, the distinction whether

(i) : $\lambda$ encodes some kind of intrinsic state of the particle-pair at the time of
emission, or maybe a bit later (but certainly before they enter the apparatuses, so
that the distribution p can be independent of the measurement choices x and y); or

(ii) : $\lambda$ encodes the total physical state of a spacetime region (maybe with
sub-regions lying to the future of the emission event); $\lambda$ is thereby myriadly complex and no
doubt contains many features that are causally, indeed stochastically, irrelevant to the
outcomes. (As we saw, this was the option taken by SEL and PPSI, though without
the $\lambda$ notation.)

This distinction (Only-Total) cuts across the first one (Many-Big), since, for
example, the total physical state in (ii) of (Only-Total) need not encode anything about
measurement choices and outcomes. Indeed, it should not do so, so as to better support
Section 3.1.1's "locality" assumptions of a Bell's theorem.

In Section 3.2.1 we thought of a common cause variable as a partition $\Lambda^m$ of the
probability space S, with a common cause event $\lambda^m_l$ being a cell of the partition (a
value of the variable). Since in general a cell is not a singleton set, there will be more
than one way a given value $\lambda^m_l$ can be realized. Clearly, this can be motivated by
adopting the second option, (ii), for each (or both) of the distinctions (Many-Big) and
(Only-Total).

As to (ii) of (Many-Big): if the elements of the probability space S encode
measurement choices and outcomes, then there will certainly be more than one way that
a given value $\lambda^m_l$ can be realized. For the simple space of quintuples $\langle \lambda, x, y, X, Y \rangle$
with which I introduced the big space approach, there will be 4 × 4 = 16 ways, since
all four joint measurement choices can have four results.
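
The count of 16 can be made explicit by enumeration. In this small sketch the setting labels `a1`, `a2`, `b1`, `b2`, the placeholder value `"lam0"` and the function name `realizations` are assumptions for illustration only.

```python
from itertools import product

settings_L = ("a1", "a2")   # left-wing measurement choices x
settings_R = ("b1", "b2")   # right-wing measurement choices y
outcomes = (+1, -1)         # two-valued results X, Y

def realizations(lam):
    """All quintuples <lam, x, y, X, Y> compatible with a fixed hidden value lam."""
    return [(lam, x, y, X, Y)
            for x, y, X, Y in product(settings_L, settings_R, outcomes, outcomes)]

ways = realizations("lam0")  # four joint settings times four joint results
```

The list `ways` has 4 × 4 = 16 distinct entries, one for each combination of joint setting and joint result.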

As to (ii) of (Only-Total): if $\lambda$ encodes the total physical state of a spacetime region,
it will be very logically strong ("rich") and will in general include countless features
that are causally, indeed stochastically, irrelevant to the outcome. So there will be
countless ways that a given value of a common cause variable $\lambda^m_l$ can be realized. Note
here that we firmly distinguish $\lambda^m_l$, a cell of a partition of S, from this rich $\lambda$, which
is: either an element of $S = \Lambda$ (viz. on the many spaces approach), or a component of
such an element (viz. on the big space approach).

With these two distinctions in hand, I can now make good the promise at the end
of Section 3.2.2, to formulate some appropriate versions of parameter independence,
outcome independence, and the condition that the common cause is not correlated
with the choice of measurement.

3.2.3.2 Formulating locality conditions

Though it would be a good exercise to
write such formulations for all four cases arising from the two distinctions, I shall here
consider just one case, viz. the many spaces approach, and take $\lambda$ to encode a total
physical state of a region (i.e. options (i) of (Many-Big) and (ii) of (Only-Total)). We
shall also see in Section 3.2.4 that Szabo treats another of the four cases: he adopts the
big space approach, and (implicitly) option (i) of (Only-Total). These two cases (ours
here, and Szabo's in Section 3.2.4) will be enough for our goals, namely replying to the
Budapest school's claim about Bell theorems (Sections 3.2.5 and 3.2.6), and arguing
that SEL fails in the Bell experiment (Section 3.3).

So let us for now adopt the many spaces approach, and take $\lambda$ to encode a total
physical state of a region. Then $S = \Lambda \ni \lambda$ and $\lambda^m_l$ is a cell of a partition $\Lambda^m$ of S.



Let us consider in order, formulating: 

(i) parameter independence, 

(ii) outcome independence, 

(iii) putting these together: factorizability; and

(iv) the common cause being independent of the measurement choice.

We will see that for each of these, we have a choice: we can either:

(a) retain Section 3.1.1's original formulations, or

(b) incorporate a reference to a (in general: non-common) common cause $\Lambda^m$.

On either option, we get an equivalence between a conjunction of (i) and (ii), with (iii).
And both options will be relevant to our reply to the Budapest school: the first option
will be taken up in Section 3.2.5, the second in Section 3.2.6.

(i) : Parameter independence: Option (a): We can repeat eq. 3.5 word for word.
That is, we can say that each "rich micro-state" $\lambda$ screens off a nearby outcome and a
distant setting, without mentioning common cause variables $\Lambda^m$. Thus, we repeat eq.
3.5 for L outcomes:

\[ \forall \lambda;\ x, y;\ X, Y = \pm 1: \quad pr_{\lambda,x}(X) = pr_{\lambda,x,y}(X) := pr_{\lambda,x,y}(X \& Y) + pr_{\lambda,x,y}(X \& \neg Y) . \tag{3.10} \]

Or Option (b): we postulate that there are probability functions $pr_x$, $pr_y$, $pr_{x,y}$, and
require that for each joint setting $m = \langle x, y \rangle = \langle a_1, b_1 \rangle, \dots, \langle a_2, b_2 \rangle$, there is
a partition $\Lambda^m$ of $\Lambda$, of which each cell $\lambda^m_l$ screens off each nearby outcome and its
distant setting. Besides, since $\lambda^m_l \subset \Lambda$ is an event, we express this screening-off with
a conditional probability, not a subscript, despite having adopted the many spaces
approach. That is, we require that $pr_x$, $pr_y$, $pr_{x,y}$ satisfy:

\[ \forall m = \langle x, y \rangle = \langle a_1, b_1 \rangle, \dots, \langle a_2, b_2 \rangle;\ \exists \Lambda^m,\ \forall \text{ cells } \lambda^m_l;\ \forall X, Y = \pm 1: \]
\[ pr_x(X / \lambda^m_l) = pr_{x,y}(X / \lambda^m_l) := pr_{x,y}(X \& Y / \lambda^m_l) + pr_{x,y}(X \& \neg Y / \lambda^m_l) ; \tag{3.11} \]

(and similarly for R outcome probabilities).

Neither formulation implies the other. Obviously, eq. 3.10 does not imply eq.
3.11, since "coarse-graining" ("losing information") can destroy screening-off. (To be
precise: if we read eq. 3.11 as having a fixed partition, then eq. 3.10 does not imply
eq. 3.11. Of course, eq. 3.10 does imply eq. 3.11 read as asserting that there is some
such partition: for eq. 3.10 is eq. 3.11 with the finest possible partition, given by the
singletons of all the elements $\lambda \in \Lambda$.)

Conversely, eq. 3.11 does not imply eq. 3.10, since "fine-graining", i.e.
conditionalizing, can produce stochastic dependence ('Simpson's paradox'; cf. comment (1)
about PPSI, above).

(ii) : Outcome independence: The situation is entirely parallel. Option (a): We
can repeat eq. 3.6 word for word, saying that each $\lambda$ and joint setting screens off the
two outcomes, without mentioning common cause variables $\Lambda^m$. Thus, we repeat eq.
3.6:

\[ \forall \lambda;\ x, y;\ X, Y = \pm 1: \quad pr_{\lambda,x,y}(X \& Y) = pr_{\lambda,x,y}(X) \cdot pr_{\lambda,x,y}(Y) . \tag{3.12} \]

Or Option (b): We can require that for each joint setting $m = \langle x, y \rangle$, there is a
partition $\Lambda^m$ of $\Lambda$, of which each cell $\lambda^m_l$ screens off the two outcomes. And as in eq.
3.11, we express the screening-off by conditionalizing on the event, despite having
adopted the many spaces approach. That is, we require:

\[ \forall m = \langle x, y \rangle = \langle a_1, b_1 \rangle, \dots, \langle a_2, b_2 \rangle;\ \exists \Lambda^m,\ \forall \text{ cells } \lambda^m_l;\ \forall X, Y = \pm 1: \]
\[ pr_{x,y}(X \& Y / \lambda^m_l) = pr_{x,y}(X / \lambda^m_l) \cdot pr_{x,y}(Y / \lambda^m_l) . \tag{3.13} \]

As with parameter independence in (i), neither formulation implies the other, and
for parallel reasons: both "coarse-graining" and "fine-graining" can destroy stochastic
independence. We will return to the converse non-implication (that "fine-graining"
can destroy screening-off, i.e. Simpson's paradox) in Section 3.2.5.

(iii) : Factorizability: Recall from Section 3.1.1 that eq. 3.4 is equivalent to the
conjunction of eq.s 3.5 and 3.6. The equivalence generalizes easily to the coarse-grained
versions considered here, provided each measurement choice determines the same
partition for all three conditions. That is to say: one easily checks that coarse-grained
factorizability, viz.

\[ \forall m = \langle x, y \rangle = \langle a_1, b_1 \rangle, \dots, \langle a_2, b_2 \rangle;\ \exists \Lambda^m,\ \forall \text{ cells } \lambda^m_l;\ \forall X, Y = \pm 1: \]
\[ pr_{x,y}(X \& Y / \lambda^m_l) = pr_x(X / \lambda^m_l) \cdot pr_y(Y / \lambda^m_l) . \tag{3.14} \]

is equivalent, not to the conjunction of eq.s 3.11 and 3.13 as they stand, each saying
'there is a partition $\Lambda^m$': but to the stronger statement with the conjunction in the
scope of the existential quantification: i.e. the statement that for any m, there is a
(single) partition $\Lambda^m$ all of whose cells do both sorts of screening off: outcomes from
distant settings, as in eq. 3.11, and outcomes from each other, as in eq. 3.13.
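
The equivalence just stated can be checked mechanically at a single fixed cell. The following is a minimal sketch under stated assumptions: the table layout `table[x, y][X, Y]`, the setting labels, the three example tables and all function names are hypothetical, and the probabilities are to be read as already conditional on one and the same cell of one and the same partition. Factorizability is tested against marginals taken at a reference distant setting, which is equivalent to the existence of single-wing functions $pr_x$, $pr_y$.

```python
from fractions import Fraction

SETTINGS = [(x, y) for x in ("a1", "a2") for y in ("b1", "b2")]
OUTCOMES = (+1, -1)

def make_table(joint):
    """joint(x, y, X, Y) -> probability; returns the nested dict table[x, y][X, Y]."""
    return {(x, y): {(X, Y): joint(x, y, X, Y) for X in OUTCOMES for Y in OUTCOMES}
            for (x, y) in SETTINGS}

def marg_L(table, x, y, X):
    """Left-wing marginal at joint setting (x, y)."""
    return sum(table[x, y][X, Y] for Y in OUTCOMES)

def marg_R(table, x, y, Y):
    """Right-wing marginal at joint setting (x, y)."""
    return sum(table[x, y][X, Y] for X in OUTCOMES)

def parameter_independence(table):
    """Each wing's marginal does not depend on the distant setting."""
    left = all(marg_L(table, x, "b1", X) == marg_L(table, x, "b2", X)
               for x in ("a1", "a2") for X in OUTCOMES)
    right = all(marg_R(table, "a1", y, Y) == marg_R(table, "a2", y, Y)
                for y in ("b1", "b2") for Y in OUTCOMES)
    return left and right

def outcome_independence(table):
    """At each joint setting, the joint probability is the product of its marginals."""
    return all(table[x, y][X, Y] == marg_L(table, x, y, X) * marg_R(table, x, y, Y)
               for (x, y) in SETTINGS for X in OUTCOMES for Y in OUTCOMES)

def factorizability(table):
    """Joint probability = single-wing probabilities that ignore the distant setting."""
    return all(table[x, y][X, Y] == marg_L(table, x, "b1", X) * marg_R(table, "a1", y, Y)
               for (x, y) in SETTINGS for X in OUTCOMES for Y in OUTCOMES)

def p(q, out):
    """Probability of outcome `out` when +1 has probability q."""
    return q if out == +1 else 1 - q

f = {"a1": Fraction(1, 3), "a2": Fraction(1, 2)}   # hypothetical left-wing pr(X = +1)
g = {"b1": Fraction(1, 4), "b2": Fraction(3, 5)}   # hypothetical right-wing pr(Y = +1)

product_table = make_table(lambda x, y, X, Y: p(f[x], X) * p(g[y], Y))
anticorrelated = make_table(lambda x, y, X, Y: Fraction(1, 2) if X != Y else Fraction(0))
signalling = make_table(
    lambda x, y, X, Y: p(Fraction(1, 2) if y == "b1" else Fraction(1, 4), X) * p(Fraction(1, 2), Y))
```

The first table satisfies all three conditions; the second satisfies parameter independence but not outcome independence; the third satisfies outcome independence but not parameter independence. In each case `factorizability` agrees with the conjunction of the other two. The sketch says nothing about the quantifier point made in the text, namely that the two sorts of screening-off must be done by the same partition.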

(iv) : Independence of measurement choice: We have a corresponding choice about
how to formulate the idea that the common cause is independent of the measurement
choice. First, Option (a): we can take 'common cause' as in Section 3.1.1 to mean the
"rich micro-state" $\lambda \in \Lambda$: then the independence is expressed exactly as before, by the
postulation of a fixed probability measure p on $\Lambda$ (rather than $p_{x,y}$).

Or Option (b): we can take 'common cause' to mean a cell (event) $\lambda^m_l$ of a partition
$\Lambda^m$ defined by a measurement choice. In that case, there is a sense in which the
common cause is trivially dependent on the measurement choice. Namely: different
values of $m = \langle x, y \rangle$ will in general define different events $\lambda^m_l$ (even if we pick the
same cell label $l = \langle \pm 1, \pm 1 \rangle$ out of the four possibilities), with in general different
probabilities, despite there being a fixed measure p.

^^But as I mentioned at the end of Section 3.2.2, Szabo's model will obey a substantive condition
requiring a non-common common cause to be stochastically independent of the measurement choice.
For Szabo adopts the big space approach, and so requires an analogue of Section 3.1.1's eq. 3.3.

Note that if in each of (i) to (iv) we take Option (a), i.e. we retain the formulation
given in Section 3.1.1, then we will be committed to a Bell inequality: despite our
education in Budapest, so to speak, that in general common causes are not common
common causes. We will return to this in Section 3.2.5's reply to Budapest, and in
Section 3.3.

On the other hand, if in each of (i) to (iv) we take Option (b), i.e. we assume
only m-dependent common cause variables, then the Budapest school's lessons (from
Section 3.2.2 and, to come, from Section 3.2.4) apply. That is: from these lessons, it
seems that a Bell inequality cannot be derived. But beware: we will return to this
option in Section 3.2.6.

So much about how to formulate locality conditions when we adopt option (i) of
(Many-Big) and option (ii) of (Only-Total). I turn to Szabo's many-common-cause
model of Bell correlations, promised at the end of Section 3.2.2.

3.2.4 Szabo's model 

Szabo's model adopts the second option of (Many-Big). That is, he adopts a big space
approach. But his space is considerably more complex than the space of quintuples
$\langle \lambda, x, y, X, Y \rangle$ with which I introduced this approach. For he constructs his model's
common cause events by adapting Hofer-Szabo et al. (1999)'s iterated construction
that takes a disjoint union of two copies of the probability space (cf. (1) and (2) in
Section 3.2.1).

For the sake of completeness, I should add about the distinction (Only-Total), that
Szabo is of course aware of it: his Section 14 sketches option (ii) of (Only-Total), in
which $\lambda$ encodes a total physical state. But he does not adopt this option. In fact, for
each of the four correlations, the common cause that Szabo constructs at each stage is 
one of the two copies of the given probability space, i.e. one of the "unionands" of the 
disjoint union: details below. 

I can now state the locality conditions obeyed by Szabo's model, with its big space
approach. I shall present them in the order in which Section 3.1.1 introduced them,
adding a reference to Szabo's equations (our notations are similar).

The first point to make concerns the probabilities of the elements of his probability
space (the atoms of his event algebra). As Szabo himself stresses (2000, Sections 13-14),
these probabilities depend on the probabilities of measurement choices; contra
our subscript-less p in Section 3.1.1. But (as I announced in footnote 11) this is not
suspicious: it in no sense suggests a "conspiracy". For on Szabo's big space approach,
each element encodes the entire experiment; or in logical language: it represents a
truth-value assignment to propositions reporting choices and outcomes, as well as those
reporting the values of common cause variables, i.e. which common cause events occur.
In short, this dependence is innocuous: just as it is for the probabilities of the quintuples
$\langle \lambda, x, y, X, Y \rangle$ in my "baby" example of the big space approach.

This means that our requirement that the measurement choices be independent
of the hidden variable $\lambda$ (our eq. 3.3) goes over in Szabo's model to a requirement
that choices be stochastically independent of the common cause events; where each
correlation, labelled $(X,Y)$ ($X = A_1, A_2$; $Y = B_1, B_2$), has its own common cause
event, written $Z_{XY}$. So Szabo's eq. (13) is:

$p(x \,\&\, Z_{XY}) = p(x)\,p(Z_{XY})$ ; $p(y \,\&\, Z_{XY}) = p(y)\,p(Z_{XY})$ . (3.15)

Parameter independence (our eq. 3.5) is expressed by the stochastic independence
of a nearby outcome and a distant setting. Thus Szabo's eq. (12) is:

$p(X \,\&\, y) = p(X)\,p(y)$ ; $p(Y \,\&\, x) = p(Y)\,p(x)$ . (3.16)

Outcome independence (our eq. 3.6) is now weakened to allow each correlation
$(X,Y)$ to have its own common cause event $Z_{XY}$. But corresponding to eq. 3.6's
requirement that all values of $\lambda$ screen-off the outcomes, Szabo requires that $\neg Z_{XY}$, as
well as $Z_{XY}$, screens off. So Szabo's eq.s (17), (18) are:

$p(X \,\&\, Y / Z_{XY}) = p(X / Z_{XY})\,p(Y / Z_{XY})$ ; $p(X \,\&\, Y / \neg Z_{XY}) = p(X / \neg Z_{XY})\,p(Y / \neg Z_{XY})$ . (3.17)

Finally, Szabo requires that the measurement choices in each wing are independent
of each other. Thus his eq. (11) is

$p(x \,\&\, y) = p(x)\,p(y)$ . (3.18)
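The four conditions, eq. 3.15 to 3.18, can be checked mechanically on any finite big space. The following is only a minimal sketch, not Szabo's model: it assumes a toy space whose atoms are tuples (x, y, z, left, right), a single correlation (A_1, B_1) with a single common cause event Z, and made-up response probabilities P_LEFT and P_RIGHT. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical response probabilities (illustrative numbers only):
# P_LEFT[(x, z)] = p(left outcome is "yes" | left choice x, common cause value z).
P_LEFT = {(1, 0): Fraction(1, 4), (1, 1): Fraction(3, 4),
          (2, 0): Fraction(1, 3), (2, 1): Fraction(2, 3)}
P_RIGHT = {(1, 0): Fraction(3, 4), (1, 1): Fraction(1, 4),
           (2, 0): Fraction(2, 3), (2, 1): Fraction(1, 3)}


def build_space():
    """Return {atom: probability}; an atom is (x, y, z, left, right)."""
    space = {}
    for x, y, z, left, right in product((1, 2), (1, 2), (0, 1), (0, 1), (0, 1)):
        p_l = P_LEFT[(x, z)] if left else 1 - P_LEFT[(x, z)]
        p_r = P_RIGHT[(y, z)] if right else 1 - P_RIGHT[(y, z)]
        # Assumption: choices and common cause are fair and mutually independent.
        space[(x, y, z, left, right)] = Fraction(1, 8) * p_l * p_r
    return space


def prob(space, *events):
    """Probability of the conjunction of the given events (predicates on atoms)."""
    return sum((p for atom, p in space.items() if all(e(atom) for e in events)),
               Fraction(0))


def independent(space, e1, e2):
    """Stochastic independence: p(e1 & e2) = p(e1) p(e2)."""
    return prob(space, e1, e2) == prob(space, e1) * prob(space, e2)


def screens_off(space, e1, e2, cond):
    """p(e1 & e2 / cond) = p(e1 / cond) p(e2 / cond), written without division."""
    return (prob(space, e1, e2, cond) * prob(space, cond)
            == prob(space, e1, cond) * prob(space, e2, cond))


# Events, named after the symbols in eq. 3.15 to 3.18.
choice_x = lambda a: a[0] == 1
choice_y = lambda a: a[1] == 1
Z = lambda a: a[2] == 1
not_Z = lambda a: a[2] == 0
X = lambda a: a[0] == 1 and a[3] == 1   # outcome A_1 (requires choice 1 on the left)
Y = lambda a: a[1] == 1 and a[4] == 1   # outcome B_1 (requires choice 1 on the right)


def check_conditions(space):
    """Return a verdict for each of eq. 3.15 to 3.18 on the given space."""
    return {
        "3.15": independent(space, choice_x, Z) and independent(space, choice_y, Z),
        "3.16": independent(space, X, choice_y) and independent(space, Y, choice_x),
        "3.17": screens_off(space, X, Y, Z) and screens_off(space, X, Y, not_Z),
        "3.18": independent(space, choice_x, choice_y),
    }


SPACE = build_space()
RESULTS = check_conditions(SPACE)
```

Exact fractions are used so that the equalities are tested exactly rather than up to rounding. In this toy the outcomes X and Y are correlated unconditionally, and the correlation disappears on conditioning on Z or on its negation.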

So much for Szabo's requirements on his models. I turn to a sketch of how he
constructs his common causes, so as to make vivid how each correlation gets a different
common cause. As I mentioned, he iterates the disjoint union construction four times,
in each case constructing a common cause. In fact, the common cause is in each case
just one of the two copies of the given space. Thus he starts with a probability space
$\Omega$ with 16 elements $u_1, \ldots, u_{16}$: and he identifies various sets of the $u$s with the four
measurement choices and the four outcomes. Then he considers two copies of this space:
$\Omega_i := \{(u,i) : u \in \Omega\}$, $i = 1,2$, and the 32-element probability space $\Omega' := \Omega_1 \cup \Omega_2$.

He shows that he can define the common cause event of one of the correlations, say
$(A_1, B_1)$, as one of the copies, say $\Omega_1$. Then at the next stage he takes two copies $\Omega'_i$ of
$\Omega'$, and defines the 64-element probability space as the disjoint union $\Omega'' := \Omega'_1 \cup \Omega'_2$.
Now he defines the common cause event of a second of the correlations, say $(A_1, B_2)$,
as one of the copies, say $\Omega'_1$, considered as a subset of $\Omega''$. And so on, up to defining his
final 256-element probability space $\Omega'''' := \Omega'''_1 \cup \Omega'''_2$, and taking the common cause of
the fourth correlation, say $(A_2, B_2)$, as $\Omega'''_1$. This makes it vivid how the four common
cause events are distinct: they have, respectively, 16, 32, 64 and 128 atoms!
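The bookkeeping of this iterated doubling can be sketched in a few lines. This is only an illustration of the counting, under the assumption (taken from the description above) that at each stage the common cause is the first of the two labelled copies; it does not assign probabilities or identify choices and outcomes, and the function names are hypothetical.

```python
def two_copies(space):
    """Return the two labelled copies {(u, i) : u in space}, for i = 1 and i = 2."""
    return [(u, 1) for u in space], [(u, 2) for u in space]


def iterate_construction(base, stages=4):
    """Repeat the disjoint-union step; record (common cause size, new space size)."""
    space = list(base)
    history = []
    for _ in range(stages):
        copy_1, copy_2 = two_copies(space)
        space = copy_1 + copy_2          # disjoint union of the two copies
        history.append((len(copy_1), len(space)))
    return history


BASE = [f"u{k}" for k in range(1, 17)]   # the 16 atoms of the starting space
HISTORY = iterate_construction(BASE)
```

Each entry of HISTORY pairs the number of atoms in the copy chosen as common cause at that stage with the number of atoms in the enlarged space, counted at the stage where the common cause is defined.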

To sum up: Szabo's model is impressive, both in its iterated technical construction,
and in the range of requirements (eq. 3.15 to 3.18) it satisfies. He is surely right to
conclude that the 'model satisfies all locality conditions [that have] been required in
the EPR-Bell literature'.

Footnote: Though I said 'Finally', Szabo also imposes some other requirements on common causes, inspired
by the Reichenbach tradition; cf. his eq.s (14), (16), (19). For example, they concern whether common
causes promote or inhibit; (cf. (ii) in footnote 8). But we can ignore these extra requirements.

But there is a catch, duly acknowledged by Szabo. Though each common cause
event is independent of the measurement choices (eq. 3.15), various Boolean combinations
of these events, such as $Z_{A_1B_1} \cap Z_{A_1B_2}$ and $Z_{A_1B_1} \cup Z_{A_1B_2}$, are correlated with the
measurement choices. This leads to the Bern school's "reply" to the Budapest school:
cf. Section 3.2.6. But first we develop another, less technical, reply.
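That independence from each event separately does not secure independence from their Boolean combinations is a general fact about probability, and a generic toy case shows it. The sketch below is not Szabo's model: it assumes two fair, independent binary events Z1 and Z2 and a "choice" x defined as their exclusive-or, all of which are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Atoms (z1, z2, x) with x = z1 XOR z2; z1 and z2 are fair and independent.
SPACE = {(z1, z2, z1 ^ z2): Fraction(1, 4) for z1, z2 in product((0, 1), repeat=2)}


def prob(*events):
    """Probability of the conjunction of the given events (predicates on atoms)."""
    return sum((p for atom, p in SPACE.items() if all(e(atom) for e in events)),
               Fraction(0))


def independent(e1, e2):
    """Stochastic independence: p(e1 & e2) = p(e1) p(e2)."""
    return prob(e1, e2) == prob(e1) * prob(e2)


Z1 = lambda a: a[0] == 1
Z2 = lambda a: a[1] == 1
choice = lambda a: a[2] == 1
Z1_and_Z2 = lambda a: Z1(a) and Z2(a)
Z1_or_Z2 = lambda a: Z1(a) or Z2(a)

PAIRWISE = independent(choice, Z1) and independent(choice, Z2)
WITH_CONJUNCTION = independent(choice, Z1_and_Z2)
WITH_DISJUNCTION = independent(choice, Z1_or_Z2)
```

Here the choice is independent of Z1 and of Z2 taken singly, yet it is correlated with both their conjunction and their disjunction, which is the logical shape of the catch just described.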

3.2.5 A common common cause is plausible 

A good case can be made that in the Bell experiment, it is physically reasonable to
postulate a common common cause. That is: a good case can be made for Section
3.1's folklore that what is "spooky" about Bell correlations is that outcome dependence
violates the strong PCC. I shall state this case; and then admit that, as the jokes say,
there is bad news as well as good news.

The main idea why a common common cause is reasonable is straightforward. Any 
two runs of a (well-designed!) Bell experiment have in common a great many features 
that are obviously relevant to the outcomes obtained (and resulting correlations) apart 
from the measurement choices: the preparation of the source and the two channels 
to the wings, the preparation of the detectors apart from their settings, the quantum 
state of the emitted particle-pair. (Agreed, any two runs will also differ in countless 
microscopic details about (a) these features and (b) other respects.) So it is reasonable 
to take these common features to define a hidden variable $\lambda$ which is common for the
different measurement choices.

This case can be made stronger (or at least: more precise!) by invoking the assumption
that we can arrange that the measurement choices are:

(i): stochastically independent of these common features (i.e. of any common cause
event);

(ii): made after these features are settled.

Again, this assumption is reasonable. For (i) is accepted not just by the usual models
(e.g. eq. 3.3), but also by Szabo's model (eq. 3.15). And (ii) is reasonable in the
light of the Aspect experiment, i.e. the fact that we can arrange each measurement
choice to be spacelike to a finite, if small, region comprising the "summit" of the past
light-cone of the emission event.

If we accept this assumption, then having the hidden variable (the common fea- 
tures) be different for different measurement choices seems to be backward causation. 

Footnote: I think that before the Budapest school's critique, I and most other "folk" had this case for
common common causes "unconsciously in mind"; and more recently, Uffink (1999, pp. S517, S523)
and Henson (2005, pp. 527, 530) surely have it in mind. But all credit to Placek, who was, so far
as I know, the first person to articulate this sort of case, in the aftermath of the Budapest school's
critique (2000, pp. 464-465; 2000a, pp. 185, 187). I also think that both the good and the bad news
should be uncontroversial; for example, Grasshoff et al. (2005, p. 678) endorse both.



This case is also illustrated by the discussion in Section 3.2.3.2. There, we considered
two cross-cutting distinctions, (Many-Big) and (Only-Total). We saw that if
we adopt the many spaces approach, and take $\lambda$ to encode a total physical state of a
region, then we can still state locality conditions exactly as we did originally, in Section
3.1.1. (This was Option (a) in Section 3.2.3.2; for example, cf. eq. 3.10, 3.12.)
These conditions lead to a Bell inequality, "despite our education in Budapest". This
illustrates the present case for a common common cause, since one way to conceive
such a cause (i.e. to define $\lambda$) is as the total physical state of an appropriate region.

I announced that there was also bad news. Namely, this case for a common common 
cause is obviously not conclusive. For my purposes, it will be enough to formulate the 
issue while taking the common common cause to be the total physical state of the 
common past; (cf. PPSI in Section 3.1.2). Recall that this means assuming that,
once we divide the total physical state of the common past into events that do, and 
those that do not, causally affect the outcomes: conditionalizing on the latter does not 
disrupt the stochastic independence induced by conditionalizing on the former. (This 
was comment (1) at the end of Section 3.1.2.) But it seems that which events within
the common past of the measurement events are causally relevant to the outcomes 
could vary with the measurement choice. And it (perhaps!) seems that this could 
be so, without there being any "conspiracy" or backwards causation. And if that is 
right, then which features of the total state of the common past (i.e. which partitions 
of the space of such states) define screeners-off — i.e. which common cause variables 
exist — would no doubt vary with the measurement choice: so that there is no common 
common cause. 

This issue takes us back again to the discussion at the end of Section 3.2.3.2: specifically,
to the logical independence (mutual non-implication) of the "fine-grained" and
"coarse-grained" formulations of parameter independence and outcome independence.
Namely: if one endorses the previous paragraph's 'seems' statements, then one will
endorse the coarse-grained formulations of these conditions, which allowed different
common causes for different measurement choices (i.e. eq. 3.11 and 3.13). And because
these formulations do not imply the original fine-grained formulations (eq. 3.10,
3.12, repeating eq. 3.5, 3.6), it seems, on the Budapest school's evidence in Sections
3.2.2 and 3.2.4, that one will not be committed to a Bell inequality. But not all is as
it seems: cf. Section 3.2.6.

3.2.6 Bell inequalities from a weak PCC: the Bern school

The Budapest school's survey of known proofs and Szabo's model (Sections 3.2.2 and
3.2.4) do not conclusively establish that any Bell theorem needs a common common
cause. Perhaps there are some plausible assumptions, not involving a strong PCC,
that imply a Bell inequality. Recent work (two papers by Grasshoff, Portmann and
Wüthrich, and one by Hofer-Szabo) shows that indeed there are.

In brief, the situation is as follows. Grasshoff et al. (2005) proved a Bell inequality
(viz. the Bell-Wigner inequality) from a weak PCC, other locality and no-conspiracy
assumptions, and an assumption of perfect anti-correlation for parallel settings. Hofer-Szabo
(2007) develops this line, both "negatively" and "positively". Negatively, he
shows that, mainly because of the assumption of perfect anti-correlation, these assumptions
entail a strong PCC. (He constructs a common common cause by conjoining
Grasshoff et al.'s separate common causes: cf. his equation (45).) But positively, he
also shows, by a continuity argument, how to derive an analogue of the Bell-Wigner
inequality from a weak PCC, without perfect anti-correlation. (The analogue adds a
"correction term" to the Bell-Wigner inequality; equation (61).) Finally, Portmann
and Wüthrich (2007) prove a theorem similar to Hofer-Szabo's. But broadly speaking,
their theorem is stronger in that their proof is constructive, rather than a continuity
argument; and their derived inequality is an analogue of the Clauser-Horne inequality,
again with correction terms (their equation (52)).
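For orientation, here is a minimal numerical sketch of how the quantum predictions for the singlet state violate an inequality of Bell-Wigner type. It assumes the familiar textbook form p(a+, b+) <= p(a+, c+) + p(c+, b+) for three coplanar directions, and the standard singlet probability (1/2) sin^2(theta/2); the exact form used by Grasshoff et al. may differ in detail, and the function names are hypothetical.

```python
import math


def singlet_plus_plus(theta_1, theta_2):
    """Quantum probability of outcomes (+, +) on a singlet pair, for spin
    measured along coplanar directions at angles theta_1, theta_2 (radians)."""
    return 0.5 * math.sin((theta_1 - theta_2) / 2) ** 2


def bell_wigner_gap(a, b, c):
    """p(a+, b+) - [p(a+, c+) + p(c+, b+)]; a positive value is a violation."""
    return singlet_plus_plus(a, b) - (singlet_plus_plus(a, c) + singlet_plus_plus(c, b))


# Directions at 0, 120 and 60 degrees: c bisects the angle between a and b.
GAP = bell_wigner_gap(0.0, 2 * math.pi / 3, math.pi / 3)
```

With these angles the left-hand side is 3/8 and the right-hand side is 1/4, so the gap is positive. Note also that singlet_plus_plus(t, t) vanishes for every t, which is the perfect anti-correlation for parallel settings that the derivation assumes.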

I will advertise this line of work by giving a few more details about the first paper.
Grasshoff et al. assume the weak PCC, eq. 3.8 (their PCC, p. 669). They also
assume there is no correlation between conjunctions of common causes and measurement
choices: that is unsurprising, since they obviously need some such assumption to
block the availability of Szabo's model. But the main assumption that enables them
to overcome the logical weakness of allowing different common causes is perfect anti-correlation:
if $b_1$ is parallel to $a_1$, then $pr(a_1 = +1 / b_1 = -1) = 1$ etc. They also of
course make locality assumptions. But I will not go into detail about these, since
their formulations are rather different from those of Section 3.1.1 and from Section
3.2.3.2's coarse-grained versions (i.e. using partitions $\Lambda^m$). The difference is not just
that Grasshoff et al. use the big space approach; they also formulate the assumptions
in the context of their related work on (a regularity theory of) causation. So it is a
good question how to rewrite their theorem in terms of Section 3.2.3.2's conditions, or
analogues: but a non-trivial question, and so not for this paper.
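The work done by perfect anti-correlation can be seen in miniature. Suppose that, conditional on some cell, the two outcomes for parallel settings factorize, with p the probability of + on the left and q the probability of + on the right. Perfect anti-correlation then requires p q = 0 and (1 - p)(1 - q) = 0. The sketch below simply searches a grid of exact fractions for solutions; the grid and the function name are illustrative assumptions, and the search is a check, not a proof.

```python
from fractions import Fraction


def anticorrelated_factorizable_cells(steps=20):
    """Grid-search pairs (p, q) with p*q = 0 and (1-p)*(1-q) = 0, i.e. factorizing
    conditional probabilities that give (+,+) and (-,-) probability zero."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return [(p, q) for p in grid for q in grid
            if p * q == 0 and (1 - p) * (1 - q) == 0]


SOLUTIONS = anticorrelated_factorizable_cells()
```

Only the two deterministic assignments survive, (p, q) = (0, 1) and (1, 0). This agrees with the elementary algebra: p q = 0 forces p = 0 or q = 0, and the second equation then forces the other probability to be 1.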

Grasshoff et al. point out (2005, p. 677) that if perfect anti-correlation were
indeed necessary for the derivation, then since it cannot be completely verified (mea- 
sured correlations are never perfect), one could presumably build a Szabo-esque model 
incorporating a small deviation from perfect anti-correlation — which would apparently 
be impossible to refute in the laboratory. 

But it turns out not to be necessary, as shown by the two later papers just cited.
(And even if it were, it is a very reasonable assumption: for though not completely
verifiable, its theoretical warrant is very strong: it arises from the conservation of
angular momentum.) So the question is, as usual: which assumption of the derivation
is the culprit? Though these authors do not say so explicitly, their discussions suggest
they believe the culprit is the weak PCC, eq. 3.8. I of course concur; (cf. Section
3.1.1's discussion setting aside interpretations and loopholes). And although I have
ducked out of rewriting their theorems in terms of Section 3.2.3.2's conditions, I wager
that in such a re-writing the culprit will be our coarse-grained outcome independence,
eq. 3.13. So: at least for this paper's purposes, I conclude from these theorems,
that the experimental violation of the Bell inequalities impugns the weak PCC, and
more specifically, coarse-grained outcome independence. Section 3.3 will develop the
consequences of this situation for SEL.

Footnote: Suffice it to say that: (i) their theorem turns on an analogue (their Result 2, p. 673) of the
standard argument by which perfect anti-correlation reduces a factorizable stochastic model to a
deterministic model (e.g. Redhead 1987, pp. 101-102); and (ii) this standard argument carries over
to Section 3.2.3.2's coarse-grained versions of factorizability etc.: each choice $m = \langle x,y \rangle$ with $x$
parallel to $y$ defines a partition $\Lambda^{\langle x,y \rangle}$ each of whose cells is deterministic, i.e. conditioning $pr_x$ and
$pr_y$ on any such cell makes probabilities trivial (0 or 1).

To briefly summarize this long Subsection: I have endorsed the cases made by 
Placek and the Bern school, that outcome dependence does impugn the PCC. Placek 
targeted the strong PCC; and the Bern school targeted the weak PCC. Now I turn to 
carrying over these arguments to SEL. 

3.3 SEL in the Bell experiment 

As always when one relates philosophers' conditions like PCC and SEL to an experi- 
ment, or its technical physical description, one has to exercise judgment about (i) how 
rigorous to be, and (ii) how well the philosophical and physical concepts mesh. 

As to (i), this paper has obviously not aimed for rigour. In particular, I ducked out 
of the natural project in connection with the Pittsburgh-Krakow school, viz. formal- 
izing and assessing SEL in their framework, since I think it would make no difference 
to my conclusions (cf. footnote 16). Sufficient unto the day is the trouble thereof! 

As to (ii), we have already seen various such judgments. For example, at the end of
Section 3.1.1, we discussed assumptions such as taking outcomes to be distinct events,
and denying the detector efficiency and locality loopholes. Only by endorsing these can
we get the verdict that the Bell experiment proves outcome dependence. Judgments
were also involved in taking outcome dependence to show that PCC (or PPSI or Bell's
local causality) is refuted: primarily in Sections 3.1.2, 3.2.3, 3.2.6 and 3.2.5.

I submit that it is clear enough that these discussions carry over to SEL; so that
it would be very repetitive to treat them seriatim, bringing in, again seriatim, the
three formulations from Section 2.3.1, especially the two formulations of SELD, and
the comparisons with PCC and PPSI from Sections 2.3.2 and 3.1.2.

Instead, I propose just to summarize SEL's violation in the Bell experiment, under 
three points. The first two return us to the work reviewed in Section 3.1.2, and so
ignore the Budapest school's distinction between the weak and strong PCC. That will 
prepare us for the third point, urging that this distinction does not affect the verdict 
that SEL is violated. 

3.3.0.1 PCC and SEL are connected by PPSI. The details of my view, summarized
in Section 3.1.2, that outcome dependence refutes PCC, went hand-in-hand
with outcome dependence refuting SEL. For I argued that outcome dependence refutes
a cousin of PCC, dubbed PPSI. Like Bell's local causality, this took total physical
states of spacetime regions (histories) as screeners-off; (it adopted the second option,
'Total', of Section 3.2.3's (Only-Total) distinction). Using total physical states, and
choosing regions appropriately, meant that:

(i) PPSI was satisfied by everyday counterexamples to PCC; 

(ii) PPSI justified parameter independence and outcome independence; and 

(iii) the other assumptions of Bell's theorem were plausible. 

The overall result was that PPSI's violation was a natural "diagnosis" for the
"spookiness" of the Bell correlations. The formulation of PPSI, and the views (i)-(iii),
underpinned my (1994)'s arguments that the Bell experiment violated its formulations
of SEL, viz. Section 2.3.1's SELS and SELD2.

To this verdict, the present paper adds two points. The first, small, one is that
Section 2.3.1's SELD1 was not formulated in my 1994. But since it implies SELD2
(Section 2.4.1.1), the violation of SELD2 implies that of SELD1. The second, main,
point is that this verdict is unaffected by our education in Budapest: cf. Section
3.3.0.3 below.

3.3.0.2 The need for other judgments. But I should mention that this verdict
required some judgments that in this paper I have so far not mentioned, since they do 
not relate to the issue of common common causes. These are judgments that I still 
endorse but which, I agree, can be denied. Though I need not repeat the details, I 
should signal that they are of two kinds. First, there is a judgment only about SEL; 
then there are judgments about probabilities, which concern PCC and SEL equally. 

(i): One can formulate the broad idea of SEL (at the start of Section 2.2) differently
from Section 2.3's three versions; and some such formulations are not violated by the
Bell experiment. Indeed, Hellman's original paper had formulations of SEL analogous
to our SELS and SELD2 (his (4) and (5); 1982, pp. 466, 495-497). But (as I mentioned in
footnote 5) he wanted to deny that the Bell experiment involved spacelike causation, or 
any other kind of "spookiness". So he hedged his formulations of SEL with provisos, 
so that they were not violated by the experiment; (nor by similar unscreenable-off 
correlations, for example in a toy-example of Hellman's about a "flashing" particle). 
So deploying my formulations of SEL, rather than Hellman's with his provisos, involves 
a judgment: the outcome dependence in the Bell experiment, and Hellman's flashing 
particle example, are "spooky". More precisely, they are spooky enough that one 
should prefer a formulation of SEL that they violate. (For details of this judgment, 
and the provisos and example, cf. my (1994, Sections 6 and 7).) 

(ii): Finally, my previous verdicts, both for PCC and SEL, needed some judgments
comparing empirical and theoretical probabilities. These judgments are uncontrover- 
sial, and in no way special to my position: but they are worth articulating just because 
many discussions are silent about them. The sort of judgment needed is shown by [a] 
and [b]. 

[a]: The trio $\langle \lambda, x, y \rangle$ of hidden variable and choices is: for PCC, an appropriate
common cause; or for SEL: an appropriate history $H$; (and similarly for local
causality). Or more exactly: though $\langle \lambda, x, y \rangle$ may not itself be the appropriate
common cause or history, a probability function determined by $\langle \lambda, x, y \rangle$ (viz. by
some sort of averaging over other factors that should also contribute to the common
cause or history) gives probabilities for outcomes that are close to those that would be
given by PCC or SEL. Here, 'close' means of course 'sufficiently close', as determined
by [b]...

[b]: The probabilities given by SEL (or PCC, or local causality) differ enough
from the quantum (or better: experimentally right!) probabilities that outcome dependence
in the experiment refutes the screening-off equation of, say, SEL. More exactly:
outcome dependence refutes, in the first place, the screening-off equation with probabilities
determined by $\langle \lambda, x, y \rangle$; then by [a], it refutes the screening-off equation of
SEL or PCC or Bell's local causality.

To sum up these two Paragraphs: I agree that taking the Bell experiment to vio- 
late SEL requires questionable judgments — which I have tried to articulate, here and 
especially in my (1994). 

3.3.0.3 Weak vs. strong SELD. How, if at all, does our education in Budapest
(i.e. the distinction between weak and strong PCC, or the question whether there is a
common common cause) affect this verdict? I submit that it does not.

The reason lies in two points from Section 3.2. First: both PPSI and SEL sidestep
the distinction by having a probability distribution prescribed by an individual world,
or history, or region's total physical state. We saw this for SEL in Section 2.3.1; for
PPSI in Comment (2b) at the end of Section 3.1.2; and for locality conditions tailored
to the Bell experiment in Option (a) of Section 3.2.3.2. Second: this feature of PPSI,
SEL and Option (a) was defended in Section 3.2.5, following Placek; (though with the
concession that a Simpson's paradox threat might prompt resort to the Bern school's
theorems: Section 3.2.6).

Of course, this is not to deny that one could introduce partitions $\Lambda$ of the probability
space into the formulation of PPSI and SEL, and so write down weak and strong
versions of them. And agreed, if one does so: (i) the earlier formulations will correspond
to strong versions (with the partition given by the singleton sets of each $\lambda$); (ii) my
verdict that PPSI and SEL are violated will hold good only for strong versions of
SEL; again, as we would expect from Section 3.2, especially Szabo's model (Section
3.2.4).

For the sake of completeness, I should exhibit such strong formulations. For simplicity,
I will again consider only the traditional Bell experiment with four correlated
pairs $(X,Y) = (A_1, B_1), \ldots, (A_2, B_2)$ (Section 3.1.1). So I will not consider SELS. We
envisage that the possible L-outcomes $A_1, A_2$ occur in the same spacetime region, and
similarly on the R-wing, so that we can specify a region as 'the common past of $X$ and
$Y$'. That is: there is for PPSI the usual ambiguity about what to mean by 'common
past': what, if anything, beyond $C^-(X) \cap C^-(Y)$ to include. But there is no further
ambiguity in the definitions of $C^-(X)$ and $C^-(Y)$.

I shall also assume that for PPSI as well as SEL, we are to use partitions of the
set $W$ of worlds to encode the idea that only some features of the common past of $X$
and $Y$, or of history up to a hypersurface $t$, or of history within a "Table Mountain"
region $C^-(t) \cap C^-(X)$, determine the probabilities of $X$ and $Y$ and their Boolean
combinations. That is: worlds in the same cell of the partition match as regards these
features, and so match as regards these probabilities. (We are of course back again
at the idea that there is no 'Simpson event', conditionalizing on which would break
the stochastic independence got by conditionalizing on a cell of the partition.) But I
will not make my formulations of PPSI and SEL express explicitly the idea that the
partition encodes features confined to a part of spacetime (viz. the common past of $X$
and $Y$, or the past of a hypersurface $t$, or a "Table Mountain" region $C^-(t) \cap C^-(X)$):
this would of course be a matter of demanding that within each cell of the partition,
any possible history of the rest of spacetime is included in some element of the cell.

With these assumptions, a strong version of PPSI would be as follows:

Strong PPSI: There is a partition $\Lambda$ of the set $W$ of worlds comprising
a Bell experiment, with each cell $\lambda \in \Lambda$ encoding features of the common
past of the outcomes $X = A_1, A_2$ and $Y = B_1, B_2$, and such that for all
outcomes $X$ and $Y$:

$pr(X \,\&\, Y/\lambda) = pr(X/\lambda) \cdot pr(Y/\lambda)$ . (3.19)

In effect, we have just combined Section 3.1.2's PPSI, eq. 3.7, with Section 3.2.1's
strong PCC.

Similarly for a strong version of SEL; with the difference that we can preserve Section
2.3.1's explicit indexing of probability distributions with either times or histories
(in addition to worlds), and its universal quantification over these indices. This universal
quantifier, 'for all times $t$/histories $H$', must of course come before the existential
quantifier, that there is a partition: which must itself come before the universal quantifier
'for all outcomes', in order that the formulation be strong. Thus we get for SELD1
and SELD2:

Strong SELD1: Given the set $W$ of worlds comprising a Bell experiment
with both outcomes $X$ (respectively $Y$) occurring in the same spacetime
region ($X$ and $Y$ spacelike):

for any hypersurface $t$ earlier than both $X$ and $Y$, and dividing $C^-(X)$,
and such that $C^-(X) \cap C^-(Y) \cap C^+(t) = \emptyset$:

there is a partition $\Lambda$ of $W$, with each cell $\lambda \in \Lambda$ encoding features of the
past of $t$, such that for all outcomes $X$ and $Y$

$pr_t(X \,\&\, Y/\lambda) = pr_t(X/\lambda) \cdot pr_t(Y/\lambda)$ . (3.20)

Strong SELD2: Given the set $W$ of worlds comprising a Bell experiment
with both outcomes $X$ (respectively $Y$) occurring in the same spacetime
region ($X$ and $Y$ spacelike):

for any hypersurface $t$ earlier than $X$, and dividing $C^-(X)$, and such that
$Y$ is in the difference $C^-(t) - C^-(X)$:

there is a partition $\Lambda$ of $W$, with each cell $\lambda \in \Lambda$ encoding features of the
history in the intersection $C^-(t) \cap C^-(X)$, such that for all outcomes $X$ and $Y$

$pr_t(X \,\&\, Y/\lambda) = pr_t(X/\lambda) \cdot pr_t(Y/\lambda)$ . (3.21)

Footnote: As mentioned in footnote 16, the Pittsburgh-Krakow framework makes rigorous sense of 'same
spacetime region' etc., and so gives the materials for formulations of PPSI and SEL that express
explicitly the partition being about only a part of spacetime. But this paper ducks out of developing
the details.

Note that though there is a time index in eq.s 3.20 and 3.21, it is "formal" in
the sense that $pr_t$ cannot be interpreted as it was in Section 2.3.1's formulations:
viz. as a probability distribution that incorporates all the ways that history up to $t$
bears on probabilities of events future to $t$. For if $pr_t$ incorporated all that, then the
conditionalization on $\lambda$ in eq.s 3.20 and 3.21 would not be conditionalization on new
information, that was not already incorporated in $pr_t$; while the idea of strong SELD1
and SELD2 is that $\lambda$ should represent new information!

With these formulations in hand, we can ask about their relations, as we did in
Section 2.4.1. In considering this, we have to bear in mind that since the time index is
now "formal", assumptions that in Section 2.4.1 were compelling or plausible may well
now not be. But for one implication, we are "lucky". It is easy to check that Section
2.4.1.1's proof that SELD1 implies SELD2 carries over intact to the strong versions;
(for it depends on relations between hypersurfaces, not at all on the interpretation
of $pr_t$). That is: Strong SELD1 implies Strong SELD2. On the other hand, Section
2.4.1.2's proof that SELD2 implies SELD1, given two assumptions, falls down. For both
assumptions depend on the interpretation of $pr_t$; (the first was that probabilities evolve
by conditionalization on intervening history). Maybe there are plausible analogous
assumptions that would give a proof.

But I shall not pursue details. Space presses, and I want to close by glimpsing how
SEL fares beyond quantum mechanics: in algebraic quantum field theory (Section 4)
and in dynamical spacetimes (Section 5). My aim will be very modest: to advertise
some recent work and open problems; and to save space, I will make no attempt to
explain the relevant frameworks and formalisms.

Footnote: Indeed, if probabilities evolve by conditionalization on intervening history, eq. 2.5, then this
interpretation of $pr_t$ would mean that any $\lambda$ is already either implied or excluded by $pr_t$, i.e.
$pr_t(X/\lambda) = pr_t(X)$ etc., or $pr_t(\lambda) = 0$.

4 SEL in Algebraic Quantum Field Theory



4.1 The story so far 

As I mentioned in Section 1, AQFT provides a more precise conception of events,
causal influences and probabilities than the philosophical conception used in Sections
2 and 3. For us, this has two main consequences. The first is a general point about
the broad idea of stochastic relativistic causality. Namely, one can write down several 
obviously different formulations of the idea, and go on to prove them to be logically 
independent of one another (even in the presence of other axioms of the framework). 
Besides, this sort of independence also holds good for corresponding formulations in 
heuristic quantum field theory. (For some details, cf. Horuzhy (1990, pp. 19-21); 
Butterfield (2007, Section 3) is a philosopher's introduction.) So I stress that, even 
setting aside our interest in formulations of SEL, the quantum field theory literature 
yields rich pickings about stochastic relativistic causality. 

The second consequence is specific to our topic, SEL. Because the conception of 
events etc. is more precise, fewer judgments are required about how best to "transcribe" 
SEL into AQFT than into quantum mechanics. In particular, it is pretty clear how 
best to transcribe Section 2.3.1's formulations of SEL into AQFT; and it is clear which
formulations are obeyed and which violated. 

This was the topic of my (1996). To summarize: I argued that the most natural 
transcription of SELS is provably satisfied. But the natural transcription of SELD2 is 
violated — on account of the outcome dependence (violations of Bell inequalities) ex- 
hibited (endemically) by quantum field quantities, shown in various papers by Landau, 
Summers, Werner et alo I need not quote these transcriptions. For our purposes, it 
is enough to say the following: 

(i) : The worlds of SELS and SELD are replaced by models of an algebraic quantum 
field theory, with each model given by a triple < M., A, ((> > where M. is the spacetime 
(in my 1996: always Minkowski spacetime), A is the assignment to each open bounded 
region O of of a local algebra A{0), and (p is the state (expectation functional). 
Accordingly, an event E is replaced by a projector in a local algebra, and E^s probabil- 
ity by the projector's expectation in 0. We think of the models as satisfying (making 
true) various sentences of a putative formal language expressing AQFT; and among 
these sentences will be ascriptions of expectation values of local projectors. 

(ii) : How should we transcribe the idea of the probability prescribed by history up 
to t, or by history within the "Table Mountain" C~{E) fl C~(t), and similar ideas? 
The most obvious tactic, viz. conditionalizing on many projectors associated with 
the region, is clearly fraught with difficulties, both mathematical and interpretative 

^''The points about SELS and SELD were prompted by a disagreement with previous literature; cf. 
Redei 1991 and MuUer and Butterfield (1994). As I mentioned in Section r2. 3. 11 there is a mnemonic: 
'S' in SELS stands for 'satisfied' as well as 'single', and 'D' in SELD stands for 'denied' as well as 
'double'. For a full survey of the results of Landau, Summers, Werner et al. upto ca. 1990, cf. 
Summers (1990). Later work includes Halvorson and Clifton (2000). 



(i.e. about value ascriptions in quantum theory): for example, are we to conditionalize 
(p on one of every pair of a projector and its orthocomplement? 

(iii) : But fortunately, we can avoid these difficulties by using the idea that models
match in their history in a spacetime region if they make true the same set of
sentences about the region. That is, we formulate SEL as a universally quantified
conditional, along the lines: 'For any two models on the same spacetime
$\langle \mathcal{M}, \mathcal{A}_1, \phi_1 \rangle$, $\langle \mathcal{M}, \mathcal{A}_2, \phi_2 \rangle$, for any region $O \subset \mathcal{M}$, and any projector
$E \in \mathcal{A}(O)$: if the two models match throughout a suitable region O' whose future
domain of dependence $D^+(O')$ includes O, $O \subset D^+(O')$, then ...'. Thus the matching
on the region O' expresses the restriction to probabilities conditional on sufficiently
rich information about the past.

(iv) : The tactic in (iii) is common to the transcriptions of SELS and SELD2. The
main difference between the transcriptions is then that SELS is about just the
expectation of $E \in \mathcal{A}(O)$, so that SELS's consequent states that the two models match in
their expectations for E, i.e. $\phi_1(E) = \phi_2(E)$; while SELD2 is about the equality of E's
expectation, with its expectation conditional on another projector, F say, associated
with a suitable region spacelike to O, so that SELD2's consequent states that in each
model $\phi_j(E) = \phi_j(E/F)$, $j = 1, 2$. (We need not discuss other minor differences, and
the definition of 'suitable region'.)
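
For orientation, the two consequents can be set side by side in symbols. This is only a schematic restatement of (iii) and (iv), not a quotation of the 1996 transcriptions: the label $O''$ for the region associated with F, and the reading of the conditional expectation as a ratio, are assumptions made for this sketch.

```latex
% Shared antecedent, as in (iii): the two models match on O', and O \subset D^+(O').

% SELS consequent: the two models agree on the expectation of E.
\phi_1(E) = \phi_2(E), \qquad E \in \mathcal{A}(O).

% SELD2 consequent: within each model, conditioning on F makes no difference.
% O'' (an assumed label) is a suitable region spacelike to O;
% the ratio form of the conditional expectation is likewise an assumption.
\phi_j(E) = \phi_j(E/F) := \frac{\phi_j(E \wedge F)}{\phi_j(F)},
\qquad F \in \mathcal{A}(O''),\quad \phi_j(F) > 0,\quad j = 1, 2.
```

Written this way, the contrast in (v) is easy to see: SELS compares one expectation across two models, whereas SELD2 compares two expectations within each model.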

(v) : As mentioned, the main results are that, thus transcribed, SELS follows di- 
rectly from two axioms of AQFT (the Isotony and Diamond axioms); and that SELD2 
is endemically violated, thanks to the endemic Bell correlations (outcome dependence) 
shown by Landau, Summers, Werner et al. 

This verdict meshes happily with that of Section 3.3: Bell correlations (outcome
dependence) violate SEL in a "double" formulation, in field theory, just as in quantum 
mechanics; again providing an appropriate diagnosis of the strangeness of the Bell 
correlations. Besides, field theory's more precise framework makes the verdict more 
secure, i.e. less dependent on extraneous judgments. 

4.2 Questions 

But as in Section 3.2, this verdict needs to be reviewed in the light of the subsequent
literature. I will briefly discuss this, focussing only on questions raised by: the
formulations in Section 2 (Section 4.2.1); the work of the Budapest and Bern schools (Section
4.2.2). In both areas, it will be clear that there are open questions.

4.2.1 Our formulations 

Section 2 presented formulations of SEL that applied not only to Minkowski spacetime
but to any stably causal, in particular globally hyperbolic, spacetime. So the main
technical question raised by considering SEL for AQFT is whether the results of Landau et
al. about endemic outcome dependence can be generalized from Minkowski spacetime
to other spacetimes. This is a large vague question, and I shall not pursue it; but the
short answer is that they can be. A bit more precisely: an endemic violation of Bell
inequalities (and so, modulo our usual assumptions: endemic outcome dependence) can 
be shown in models using globally hyperbolic spacetimes. For example, cf. Proposition 
4 of Halvorson and Clifton (2000). (For an introduction to the algebraic formulation 
of quantum field theory on curved spacetimes, and its advantages, cf. Wald (1994: pp. 
53-65, 73-85).) 

Section 2 raises two other obvious questions about my (1996)'s discussion. First, we
can ask about Section 2.3.1's SELD1, whose transcription to AQFT was not discussed
in (1996), since SELD1 was not formulated in my 1994. Again, I will not pursue
details since the situation is straightforward. One can transcribe SELD1 into AQFT
along the same lines, and as naturally, as my (1996) transcribed SELD2; and again,
endemic outcome dependence shows that SELD1 is violated.

Besides, we saw in Section 2.4.1 that SELD1 implies SELD2 by a short argument
which turned just on spatiotemporal relations between the events E and F and the
hypersurface t. Though I will not give details, it is straightforward to show that SELD1
and SELD2 can be transcribed into AQFT in such a way that this argument carries
over. So using these formulations, AQFT's violation of (the transcription of) SELD1
will follow from its violation of (the transcription of) SELD2, by modus tollens.

The second obvious question raised by Section 2 concerns Section 2.4.2's proof of
equivalence of SELS and SELD2. Since the former holds, and the latter fails, once
they are transcribed in AQFT, we know this proof must break down in AQFT. And
there is no mystery. Both its main assumptions, viz.

(i) probabilities evolve by conditionalization (eq. 2.5);

(ii) all the worlds in W have the same initial probability function pr;

evidently fail in AQFT: i.e. once transcribed in the obvious way suggested by (i)-(iii)
in Section 4.1.



4.2.2 The Budapest and Bern schools 

I turn to three questions raised by the work of the Budapest and Bern schools. The
first is obvious in the light of the discussion of Sections 3.2 and 3.3. There we learnt:

(i) : from the Budapest school that outcome dependence impugns primarily the
strong PCC, not the weak one; but also:

(ii) : one could reply that the strong PCC was plausible (Section 3.2.5), and that
SEL reflected this by having a probability distribution prescribed by an individual
world, or history, or region's total physical state (Section 3.3.0.3 and previous
discussions cited there).

So now the obvious question is: do (i) and (ii) carry over to AQFT? I think it is
clear that yes, they do; though again I will not enter into details. That is:

Footnote: Besides, the proof's two simplifying assumptions, that there are finitely many
possible histories ((a) and (b) of Section 2.4.2), may carry over less well to AQFT than to
realistic classical theories. But even if there are no such cardinality issues, the conceptual
assumptions (i) and (ii) fail.

(i'): A Bell's theorem for field-theoretic quantities pertaining to spacelike regions
would normally proceed from a strong PCC. (For details, cf. eq. 3.6 - 3.8 in my (1996).
Here, I say 'normally', since you might ask: what about adapting the Bern school's
theorems (Section 3.2.6) to AQFT? For this, cf. question (2) below.)

(ii'): But again, a strong PCC is plausible; and once SELD is transcribed to AQFT
in terms of two models matching in all the sentences about a region that they make
true ((iii) in Section 4.1), it reflects the strong PCC in the same way as we saw in
Section 3.3.0.3.

The second and third questions arise from a result of Redei and Summers (2002),
showing that in a certain sense the weak PCC is provably satisfied, not violated, by
AQFT. More precisely, they show that: for any AQFT, i.e. assignment of local algebras
to regions $\mathcal{M} \supset O \mapsto \mathcal{A}(O)$, subject to standard conditions; for any spacetime regions
$O_1$, $O_2$ contained in a pair of spacelike double cones; for any state of a standard type
(viz. locally normal and faithful); for any pair of projectors $E \in \mathcal{A}(O_1)$, $F \in \mathcal{A}(O_2)$: if
E and F are correlated, i.e.

\[ \phi(E \wedge F) > \phi(E)\,\phi(F) \tag{4.1} \]

then there is a projector C in (the algebra for) the weak causal past of $O_1$ and $O_2$, i.e.
the region $(C^-(O_1) - O_1) \cup (C^-(O_2) - O_2)$, consisting of points which can causally
influence some point in either $O_1$ or $O_2$, that screens off the correlation in eq. 4.1.
That is:

\[ \phi(E \wedge F / C) = \phi(E / C)\,\phi(F / C) . \tag{4.2} \]

This prompts two questions:

(1) : How can this result be reconciled with the gist of our previous discussion, that 
SELD is violated? And: 

(2) : How can this result be reconciled with Hofer-Szabo's and the Bern school's
theorems (Section 3.2.6), that a Bell's theorem can be proved using a weak PCC?
(There is a tension here since AQFT endemically violates Bell inequalities, so that we 
expect it to violate some assumption or other of any Bell's theorem.) 
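
The screening-off pattern of eqs 4.1 and 4.2 can be illustrated with a purely classical toy model. The sketch below is not drawn from Redei and Summers' paper: the binary common cause, the numerical probabilities and all function names are assumptions chosen for illustration, and ordinary conditional probability stands in for conditioning on a projector.

```python
from fractions import Fraction
from itertools import product

# Illustrative numbers only: a binary common cause C, with E and F
# conditionally independent given each value of C.
P_C = {1: Fraction(1, 2), 0: Fraction(1, 2)}
P_E_GIVEN_C = {1: Fraction(9, 10), 0: Fraction(1, 5)}
P_F_GIVEN_C = {1: Fraction(4, 5), 0: Fraction(1, 10)}


def joint():
    """Joint distribution over (c, e, f), built from the common-cause structure."""
    table = {}
    for c, e, f in product((0, 1), repeat=3):
        pe = P_E_GIVEN_C[c] if e else 1 - P_E_GIVEN_C[c]
        pf = P_F_GIVEN_C[c] if f else 1 - P_F_GIVEN_C[c]
        table[(c, e, f)] = P_C[c] * pe * pf
    return table


def prob(table, pred):
    """Probability of the event picked out by pred(c, e, f)."""
    return sum(p for key, p in table.items() if pred(*key))


def is_correlated(table):
    """Classical analogue of eq. 4.1: P(E and F) > P(E) P(F)."""
    p_ef = prob(table, lambda c, e, f: e and f)
    return p_ef > prob(table, lambda c, e, f: e) * prob(table, lambda c, e, f: f)


def screens_off(table, c_val):
    """Classical analogue of eq. 4.2, conditional on C taking the value c_val."""
    p_c = prob(table, lambda c, e, f: c == c_val)
    p_ef = prob(table, lambda c, e, f: c == c_val and e and f) / p_c
    p_e = prob(table, lambda c, e, f: c == c_val and e) / p_c
    p_f = prob(table, lambda c, e, f: c == c_val and f) / p_c
    return p_ef == p_e * p_f
```

In this toy model E and F are correlated, yet each value of C screens the correlation off. The example shows only the logical shape of a common cause; it carries none of the spatiotemporal or algebraic content of the theorem.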

I shall briefly state my view about the answers to these questions; but again, it will 
be obvious that there are open questions hereabouts. 

(1): I think the reconciliation lies in two contrasts between Redei and Summers' 
result, and the violation of SELD (as transcribed to AQFT). Both are obvious. The 
first relates to the distinction between the weak and strong PCC, which dominated 
Section 3. As noted in (ii') just above, SELD reflects a strong PCC. But Redei and
Summers prove a weak PCC: for each E, F pair, there is a screener-off C.

Footnote: They of course recognize this, and see it as developing the Budapest school's
critique of the Bell literature's conflating the weak/strong distinction. In jocular terms:
they would be worried if they could prove a strong PCC, for fear of a Bell's theorem.

Second, the two results use very different ideas of "conditionalizing on the
screener-off". Redei and Summers conditionalize in the usual sense (albeit adapted to quantum
states) on an event (projector) C. But ever since Section 2.2, the idea of SEL, like that
of Bell's local causality, is to consider the probability prescribed by the whole history
of a (large) region. And as reported in (ii) and (iii) of Section 4.1, this is transcribed
into AQFT, not in terms of the (problematic) idea of conditioning on some sort of
"complete" set of projectors for the region, but in terms of matching of models of the
theory.

A side-remark. Beware: reading Redei and Summers' theorem, you might think 
that the reconciliation lies elsewhere — in a contrast about the spatiotemporal location 
of the screener-off. For Redei and Summers' screener-off C is associated with the weak
past of the regions $O_1$ and $O_2$. But SELD (transcribed to AQFT) requires matching
of models on a region much smaller than the weak past. (In fact, in (iii) of Section
4.1 I talked vaguely about 'a suitable region' O', saying only that its future domain
of dependence $D^+(O')$ must satisfy a certain condition: but the details in my (1996)
make it clear that the region of required matching is indeed much smaller than the 
weak past.) But beware: this contrast is not the reason (or even: "a third reason") for 
the reconciliation. For as my 1996 brings out, in AQFT's violation of SELD the region 
of matching can be taken to be much larger than the weak past. In other words: AQFT 
also violates a version of SELD that is apparently weaker than my transcription, since 
it assumes models' matching on a much larger region. Accordingly, the reason for the 
reconciliation must be sought elsewhere: by my lights, in the two contrasts above. 

(2): I turn to the question why Section 3.2.6's Bell's theorems, assuming a weak
PCC, fail to carry over to AQFT. (As I presume they must, on pain of their conjunction
with Redei and Summers' theorem contradicting the endemic violation of Bell
inequalities.) Here I am less confident than for question (1), and will just endorse a
suggestion of Portmann's (to whom my thanks). Namely: the assumption that
conjunctions of common causes (for different correlations) are statistically independent of
measurement choices fails in AQFT, once transcribed in the natural way with common
cause projectors C in the sense of Redei and Summers. (This is assumption 5 in
Portmann and Wüthrich (2007), and equation (58) in Hofer-Szabo (2007). Recall that
Szabo's model shows it is needed for a Bell's theorem.) Indeed, we would expect it to
fail in so far as two common cause projectors $C_1$ and $C_2$ will in general not commute,
and so will lack a joint quantum probability distribution.
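
The remark about non-commuting projectors can be illustrated in the simplest possible setting. The sketch below assumes a two-dimensional Hilbert space with two hypothetical projectors, called C1 and C2 here, and a pure state. This is nothing like the local algebras of AQFT; it only shows the generic phenomenon that, for non-commuting projectors, the probability of "both" depends on the order of conditioning, so that no single joint distribution reproduces both orders.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


HALF = Fraction(1, 2)
C1 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(0)]]  # projector onto |0>
C2 = [[HALF, HALF], [HALF, HALF]]                              # projector onto (|0>+|1>)/sqrt(2)
RHO = C1                                                       # illustrative pure state |0><0|


def commute(a, b):
    return matmul(a, b) == matmul(b, a)


def sequential_probability(first, second, rho):
    """Probability of finding `first` and then `second`: Tr(second first rho first)."""
    return trace(matmul(second, matmul(first, matmul(rho, first))))
```

Here the two orders give 1/2 and 1/4 respectively. For commuting projectors the two orders always agree, which is why the difficulty does not arise for a single screener-off, or for projectors associated with spacelike separated regions.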



Footnote: For the Bern school's earlier theorem (Grasshoff et al. 2005) which assumed perfect
anti-correlation, there may also be a reconciliation in the failure of that assumption in AQFT.
For so far as I know, the best we can do for this assumption is along the lines that for any
$\epsilon > 0$, there is a pair of (spacelike-related) projectors which are "$1 - \epsilon$" correlated;
(Redhead, 1995, Theorem 4', p. 133).

5 SEL in dynamical spacetimes

Among the many approaches to "quantum gravity", i.e. the reconciliation of quantum
theory and general relativity, there are some that try to combine directly the ideas of
stochasticity (from quantum theory) and dynamical spacetime, i.e. the metric
representing gravity and so being responsive to matter (from general relativity). In these
approaches, SEL and related ideas like Bell's local causality of course play a central
role.

I will end by indicating two examples. In Section 5.1 I report that Kent (2005) uses
a formulation of SEL (albeit under another name) appropriate to such a spacetime, so
as to propose a (doable!) experiment exploring the interaction of quantum mechanics
and general relativity (or any theory that geometrizes gravity and so has a dynamical
spacetime). Then in Section 5.2 I describe how for the causal set programme,
developed by Sorkin and others, the naive transcription of SEL fails trivially, prompting
the search for better causal-set formulations of relativistic causality.

5.1 SEL for metric structure? 

Kent (2005a) does not mention SEL: he works with a formulation of Bell's local
causality appropriate for theories with a dynamical metric like general relativity. But it will
be clear that his discussion could be recast in terms of our SELD1 and SELD2 (cf.
Section 2.3.2), and their refutation by the Bell experiment (Section 3.3).

Kent proposes (following a suggestion of Dowker) that Bell's condition be adapted
to a stochastic geometrized theory of gravity, as follows. Let $\Lambda$ be a spacetime region
equipped with specified metric and matter fields, that contains its own causal past. Let
$\kappa$ be any spacetime region with specified metric and matter fields. Let $pr(\kappa/\Lambda)$ be the
probability that the domain of dependence $D(\Lambda)$ of $\Lambda$ is isometric to $\kappa$. Let $\kappa'$ be any
other region of spacetime with specified metric and matter fields, which we know to
be spacelike from $D(\Lambda)$. Then we say that the envisaged theory of spacetime is locally
causal iff for all such $\Lambda$, $\kappa$, $\kappa'$, we have

\[ pr(\kappa/\Lambda) = pr(\kappa/\Lambda, \kappa') . \tag{5.1} \]



Kent now makes three points. 

(1) : General relativity is locally causal, thanks to the metric and matter fields in
$D(\Lambda)$ being completely determined by those in $\Lambda$. (Of course a wealth of mathematical
physics, about the well-posedness of initial value problems, lies behind this point. But 
we do not need details.) 

(2) : But now imagine a standard Bell experiment, with the wings spacelike related, 
in which each detector is coupled to a nearby Cavendish experiment, so that in each 
wing each measurement choice and outcome leads to 

one of four different configurations of lead spheres — configurations which 
we know would, if the experiment were performed in isolation, produce one 
of four macroscopically and testably distinct local gravitational fields ... 
The separation of the two wings is such that the gravitational field test 
on either wing can be completed in a region space-like separated from the
region in which the photon on the other wing is detected ... Extrapolating
any of the standard interpretations of quantum theory to this situation, we
should expect to see precisely the same joint probabilities for the possible
values of the gravitational fields in each wing's experiments as we should for
the corresponding outcomes in the original Bell experiment ... Then, if $\kappa$ is
the region immediately surrounding the measurement choice and outcome
in one wing of the experiment, $\kappa'$ the corresponding region for the other
wing, and $\Lambda$ the past of $\kappa$, we have

\[ \mathrm{Prob}(\kappa|\Lambda) \neq \mathrm{Prob}(\kappa|\Lambda, \kappa') . \]

... [Thus] we have a direct conflict between the predictions of two outstand- 
ingly successful theories: quantum theory and general relativity. (Kent 
2005a, p. 2) 

(3): Kent then considers what are the conceivable experimental outcomes. There 
are several possibilities, but I shall just quote the two obvious ones — and recommend 
Kent's discussion for more details. 

One is that the violations of local causality predicted by quantum theory, 
and to be expected if some quantum theory of gravity holds true, are indeed 
observed. This would empirically refute a key feature of general relativity, 
namely, the local causality of space-time ... [A second conceivable outcome 
is] ... that the measurement results obtained from the detectors fail to 
violate the [Bell] inequality. This would imply that quantum theory fails to 
describe correctly the results of the Bell experiment embedded within this 
particular experimental configuration, and so would imply a definite limit 
to the domain of validity of quantum theory. (Kent 2005a, p. 3). 

5.2 SEL for causal sets? 
5.2.1 The causal set approach 

I turn to the causal set programme. It models spacetime as a discrete set of spacetime
points, partially ordered by causal connectibility, which "grows" by a stochastic process
of adding points to the future of the given discrete set. The set is called a 'causal set',
or for short causet. In more detail: a causet is a partially ordered set (poset) C which
is locally finite. The partial order represents causal connectibility; and 'locally finite'
means that, writing the partial order as $x \prec y$, we require that for any x, y, the set
$\{z : x \prec z \prec y\}$ is finite. We think of this causal structure as growing in successive
stages. At any given stage, a new element gets added to the immediate future of some
of the given elements. Here 'some' (i) means, in general, not all, and (ii) allows 'none':
i.e. the new element can be spacelike to all the given elements. Then at the next stage,
another new element is added to (the immediate future of some of) the augmented set;
and so on. Furthermore, this growth is stochastic: there are to be probabilistic rules for
the various ways of adding an element. We write $\Omega(n)$ for the set of n-element causets,
i.e. posets with n elements which are a fortiori locally finite. So with $C \in \Omega(n)$ we
envisage probabilistic rules for transitions: $C \in \Omega(n) \to C' \in \Omega(n + 1)$.

Footnote: For this Section's purposes, the main references are: Brightwell et al. (2002),
Dowker (2005), and Rideout and Sorkin (2000).
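
To make these definitions concrete, here is a minimal sketch of a finite causet. The representation (a transitively closed set of pairs (x, y) meaning that x precedes y), the four-element "diamond" example and the function names are all assumptions made for illustration; they are not taken from the causet literature. Any finite poset is trivially locally finite, so the interval function merely exhibits the set which the definition requires to be finite.

```python
def is_strict_partial_order(elements, prec):
    """Check that prec, a set of pairs (x, y) meaning x precedes y, is irreflexive and transitive."""
    irreflexive = all((x, x) not in prec for x in elements)
    transitive = all(
        (x, z) in prec
        for (x, y) in prec
        for (y2, z) in prec
        if y == y2
    )
    return irreflexive and transitive


def interval(elements, prec, x, y):
    """Return {z : x precedes z and z precedes y}."""
    return {z for z in elements if (x, z) in prec and (z, y) in prec}


# Illustrative "diamond": a precedes b and c, which both precede d.
ELEMENTS = {"a", "b", "c", "d"}
PREC = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")}
```

In the diamond, the interval between a and d is {b, c}, while b and c are spacelike to each other, so the interval between them is empty.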

Of course, we also envisage that matter fields should be added to this framework of 
stochastic causal structure. But plenty of interesting questions, both conceptual and 
technical, can be formulated and attacked without adding a representation of matter 
to the framework. I shall present just one, viz. the specification of the probabilistic 
rules (the dynamics), since it relates to SEL. 

Besides, even without adding matter, there are three general motivations for inves- 
tigating this framework, which are worth stressing before discussing dynamics. First, 
there are various results to the effect that the causal structure of a general relativistic 
spacetime "almost determines" its metric structure (so that philosophers' traditional 
idea of a "causal theory of time" is "almost right"). These results make it reasonable 
to take as a "toy-model" of stochastic metric structure of the sort we expect quantum 
gravity to require, a stochastic causal structure. 

Second, the causet framework provides analogues of general relativistic ideas, such 
as general covariance and the definition of observables, whose bearing on quantum 
gravity is agreed to be important and problematic. So causets provide a "toy-model" 
or "conceptual laboratory" where issues about these ideas can be investigated. For 
us, the important example of this will be that causets raise good questions about how 
to formulate relativistic causality for a dynamical spacetime: in particular, the naive 
transcription of SEL fails. 

Third, a point for philosophers who (like me) endorse the "block-universe" or "B- 
theory" of time. I (together with my informants who work on causets!) take the 
causet approach to be consistent with this metaphysical view — despite its talk about 
'spacetime growing', 'spacetime points being born' etc. Indeed, there are two points 
here. The first is not specific to causets: it applies equally to classical stochastic 
process theory, or indeed indeterminism in general, and is familiar to philosophers: 
viz., these are consistent with the block-universe, despite their talk about one of many
future alternatives 'coming to be'. The second point is that in the causet approach,
the successive stages of a causet's growth are "gauge": they are not intended to be
physically significant. This point is taken up in Section 5.2.2.

5.2.2 Labelled Causal Sets; General Covariance 

To define a stochastic evolution on causets, it has hitherto been indispensable to think
in terms of successive stages, at each of which a new point is added to the given causet.
So each element/point of each causet in such a process (in the jargon of stochastic
process theory: such a realization, trajectory) is labelled by a natural number $n \in \mathbb{N}$,
representing the stage at which it was added. Writing $l(x) \in \mathbb{N}$ for the label of the
point x, the labelling therefore obeys: $x \prec y \Rightarrow l(x) < l(y)$. (Of course, the converse
implication fails since the point added later, viz. y, might be added so as to be
spacelike to x.) So each labelled causet:

(i) : has a "birth" ("big bang"), when at stage 1, the trivial poset, the singleton set
of one spacetime point, is "born"; and so:

(ii) : is finite towards the past (but we allow the stochastic process to run to
infinity, so that we can consider denumerable posets);

(iii) : determines an upward path through the poset of all finite unlabelled posets
(ordered by inclusion in the obvious sense): so again writing $\Omega(n)$ for the set of
n-element unlabelled causets, we write this poset of all finite unlabelled causets as
$\Omega(\mathbb{N}) := \bigcup_{n \in \mathbb{N}} \Omega(n)$. (So all the n-element causets form rank n of $\Omega(\mathbb{N})$.)

The process of growth on labelled causets is to be non-deterministic, in the sense
that for each labelled causet $\tilde{C}_n$ of n elements, and each subset $S \subset \tilde{C}_n$ that is closed
under taking of "ancestors" (i.e. if $y \in S$ and $x \prec y$ then $x \in S$): the next stage,
labelled n + 1, could add its new element just to the future of S's maximal elements.
In an obvious jargon: stage n + 1 could choose S as its newborn element's precursor
set.
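
The candidate precursor sets of a small labelled causet can be listed by brute force. The sketch below assumes a causet stored as a map from each element to the set of its ancestors; that representation, the three-element example and the function names are illustrative choices only.

```python
from itertools import combinations


def precursor_sets(past):
    """List every subset closed under ancestors, i.e. every candidate precursor set.

    `past` maps each element to the frozenset of its ancestors.
    """
    elements = sorted(past)
    closed = []
    for size in range(len(elements) + 1):
        for combo in combinations(elements, size):
            candidate = frozenset(combo)
            if all(past[y] <= candidate for y in candidate):
                closed.append(candidate)
    return closed


def maximal_elements(subset, past):
    """Elements of `subset` with no descendant inside `subset`: the would-be parents."""
    return {x for x in subset if not any(x in past[y] for y in subset)}


# Illustrative labelled causet: element 1 precedes element 2; element 3 is spacelike to both.
PAST = {1: frozenset(), 2: frozenset({1}), 3: frozenset()}
```

For this example there are six candidate precursor sets, including the empty set (the newborn element is spacelike to everything) and the whole causet (the newborn element lies to the future of both maximal elements, 2 and 3). The subset {2} is excluded, since it omits the ancestor 1.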

Furthermore, the process of growth is to be stochastic in that for all $\tilde{C}_n$ and for all
such closed-under-ancestors subsets $S \subset \tilde{C}_n$, there is to be a corresponding transition
probability. The set of all these transition probabilities fixes a dynamics.

I said that to define this stochastic dynamics, it has hitherto been indispensable to
think in terms of stages, i.e. to use labelled causets. But the labelling is obviously in
part "gauge", i.e. lacking physical significance. (I say 'in part', because of facts like
our example above: $x \prec y \Rightarrow l(x) < l(y)$.) Thus think of adding to the singleton set
{x}: either

(i) : at stage 2, a point y with $x \prec y$; then at stage 3, a point z spacelike to both x
and y; or

(ii) : at stage 2, a point z spacelike to x; then at stage 3, a point y with $x \prec y$ but
z spacelike to y (i.e. not $z \prec y$).

Intuitively, the contrast between (i) and (ii) is without physical significance, in the 
same sort of way that the choice of coordinates in general relativity is without physical 
significance ('general covariance' or 'diffeomorphism invariance'). 
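
That the two histories (i) and (ii) differ only by labelling can be checked mechanically. In this sketch, which is an illustration and not part of the causet formalism, each element is identified with its label (its stage of birth), a labelled causet is given by its set of order relations, and two labelled causets count as the same unlabelled causet if some permutation of labels maps one set of relations onto the other.

```python
from itertools import permutations


def is_natural_labelling(relations):
    """Check that a pair (a, b), meaning a precedes b, always has label a < label b."""
    return all(a < b for a, b in relations)


def isomorphic(n, rel_a, rel_b):
    """Brute-force test of whether two n-element posets differ only by relabelling."""
    labels = range(1, n + 1)
    target = set(rel_b)
    for perm in permutations(labels):
        relabel = dict(zip(labels, perm))
        if {(relabel[a], relabel[b]) for a, b in rel_a} == target:
            return True
    return False


HISTORY_I = {(1, 2)}   # x = 1, y = 2, z = 3: the only relation is x before y
HISTORY_II = {(1, 3)}  # x = 1, z = 2, y = 3: the only relation is again x before y
```

Both labellings respect the order, and the two relation sets are isomorphic, so they describe one and the same unlabelled causet: a two-element chain together with one spacelike point.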

The causet approach endorses this intuition, and accordingly requires this sort of
difference to "cancel out" in any proposed dynamics. So the idea is that the probability
of any unlabelled causet C is independent of any attributed "order of birth". This is
made precise by imposing an assumption called 'discrete general covariance' (DGC). It
says that if $\gamma$ is a path through $\Omega(\mathbb{N})$ from the singleton causet to an n-element causet
C (so C is in rank n of $\Omega(\mathbb{N})$), the product of the transition probabilities along the
links of $\gamma$ is the same as for any other such path.

Footnote: To be precise: it can be shown that a specification of all these transition
probabilities fixes a stochastic process on the $\sigma$-algebra generated by the cylinder sets built on
all the finite labelled causets. These measure-theoretic notions are crucial for the causet
approach's characterization of observables; for this topic, cf. Brightwell et al. (2002, 2003).
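
DGC can be checked by hand for a simple example dynamics. The sketch below assumes 'transitive percolation', a dynamics discussed by Rideout and Sorkin (2000) but not otherwise used in this paper: a transition from an n-element causet whose precursor set has w elements, m of them maximal, is assigned probability p^m (1 - p)^(n - w). The value p = 1/3 and the function names are illustrative. The code multiplies the transition probabilities along the two labelled histories (i) and (ii) of Section 5.2.2, which arrive at the same unlabelled causet.

```python
from fractions import Fraction


def maximal_elements(subset, past):
    """Elements of `subset` with no descendant inside `subset`."""
    return {x for x in subset if not any(x in past[y] for y in subset)}


def percolation_probability(precursor, past, p):
    """Assumed example dynamics: p**m * (1 - p)**(n - w)."""
    n = len(past)
    m = len(maximal_elements(precursor, past))
    return p ** m * (1 - p) ** (n - len(precursor))


def history_probability(precursors, p):
    """Multiply transition probabilities along a labelled history.

    The history starts from the singleton causet (element 0); precursors[k]
    is the ancestor-closed precursor set chosen for element k + 1.
    """
    past = [frozenset()]
    total = Fraction(1)
    for chosen in precursors:
        chosen = frozenset(chosen)
        if not all(past[y] <= chosen for y in chosen):
            raise ValueError("precursor set must be closed under ancestors")
        total *= percolation_probability(chosen, past, p)
        past.append(chosen)
    return total


P = Fraction(1, 3)         # illustrative value
HISTORY_I = [{0}, set()]   # y born above x, then z spacelike to both
HISTORY_II = [set(), {0}]  # z born spacelike to x, then y above x only
```

Both products equal p(1 - p)^2, i.e. 4/27 for p = 1/3, as DGC requires. This checks one pair of paths for one dynamics; it is of course no substitute for the general theorem.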

5.2.3 Deducing the dynamics 

It is a remarkable theorem (due to Rideout and Sorkin (2000)) that DGC, together
with an assumption of relativistic causality, constrains the stochastic dynamics very
severely. Namely:

(1) : At any stage n + 1, the probability, $q_n$ say, to add a completely disconnected
element can depend only on n, i.e. on the cardinality n of the "current" causal set or
the "current" rank in $\Omega(\mathbb{N})$. And:

(2) : The dynamics, i.e. the set of all transition probabilities, is given explicitly by a
formula in terms of the $q_n$.

I shall report this theorem, both for its intrinsic interest, and in order to state its
relativistic causality assumption: that will set the stage for our final return to SEL, in
Section 5.2.4.

Rideout and Sorkin impose what they call 'Bell causality'. It says that the ratio of
probabilities of two transitions depends only on the two corresponding precursor sets.
(Recall that a precursor set is the set of "ancestors" of the newborn element.) That is:
Let $C \to C_1$, $C \to C_2$ be two transitions from $C \in \Omega(n)$ to $C_i \in \Omega(n + 1)$. Then

\[ \frac{prob(C \to C_1)}{prob(C \to C_2)} = \frac{prob(B \to B_1)}{prob(B \to B_2)} \tag{5.2} \]

where

(i) $B \in \Omega(m)$, $m \leq n$, is the union of the precursor sets for $C \to C_1$ and $C \to C_2$, and

(ii) $B_i \in \Omega(m + 1)$ is B with an element added in the manner of the transition $C \to C_i$.
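
Bell causality can likewise be checked in a small example. As an assumed example dynamics, the sketch again uses transitive percolation (a transition with a precursor set of w elements, m of them maximal, from an n-element causet, has probability p^m (1 - p)^(n - w)); the four-element causet, the two precursor sets and all names are illustrative. The code compares the ratio of two transition probabilities from C with the corresponding ratio from B, the union of the two precursor sets.

```python
from fractions import Fraction


def maximal_elements(subset, past):
    """Elements of `subset` with no descendant inside `subset`."""
    return {x for x in subset if not any(x in past[y] for y in subset)}


def percolation_probability(precursor, past, p):
    """Assumed example dynamics: p**m * (1 - p)**(n - w)."""
    m = len(maximal_elements(precursor, past))
    return p ** m * (1 - p) ** (len(past) - len(precursor))


def restrict(past, subset):
    """Sub-causet on an ancestor-closed subset."""
    return {x: past[x] & subset for x in subset}


def bell_causality_holds(past, s1, s2, p):
    """Compare the ratio of probabilities in C with the ratio in B = s1 | s2."""
    b = s1 | s2
    past_b = restrict(past, b)
    ratio_c = percolation_probability(s1, past, p) / percolation_probability(s2, past, p)
    ratio_b = percolation_probability(s1, past_b, p) / percolation_probability(s2, past_b, p)
    return ratio_c == ratio_b


P = Fraction(1, 3)
# Illustrative C: element 0 precedes element 1; elements 2 and 3 are spacelike to everything.
PAST = {0: frozenset(), 1: frozenset({0}), 2: frozenset(), 3: frozenset()}
S1 = frozenset({0, 1})  # newborn element placed above the chain
S2 = frozenset({2})     # newborn element placed above element 2 only
```

The two ratios agree, because the factor contributed by element 3, which lies outside B, cancels between numerator and denominator. Note that the individual probabilities do not agree: the transition with precursor set S1 has probability p(1 - p)^2 from C but p(1 - p) from B.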

Rideout and Sorkin then show that DGC and Bell Causality imply the following
two results, (1) and (2):

(1) : The probability to add a completely disconnected element can depend only on
the cardinality n of the "current" causet, i.e. the rank in $\Omega(\mathbb{N})$.

To state result (2), we write this probability as $q_n$, and the binomial coefficient
"choose k from n" as $C^n_k$. We then define the parameters

\[ t_n := \sum_{k=0}^{n} (-1)^{n-k} \, C^n_k \, \frac{1}{q_k} . \]

Then we have:

(2) : Setting aside zero probabilities, an arbitrary transition probability, $\alpha_n$, from
$\Omega(n)$ to $\Omega(n + 1)$, with

(i) : $\varpi$ := the cardinality of the transition's precursor set S

(ii) : m := number of maximal elements in S (= number of "parents" of the new
element)

is given by the formula

\[ \alpha_n = \frac{\sum_{l=m}^{\varpi} C^{\varpi - m}_{\varpi - l} \, t_l}{\sum_{j=0}^{n} C^n_j \, t_j} . \tag{5.3} \]
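
As a sanity check on result (2), the formula can be evaluated numerically. The sketch assumes the Rideout-Sorkin expressions in the form t_n = sum over k of (-1)^(n-k) C(n,k)/q_k, and alpha_n = [sum over l from m to w of C(w-m, w-l) t_l] / [sum over j from 0 to n of C(n,j) t_j], where w is the cardinality of the precursor set. It tests them on the assumed special case q_n = q^n (transitive percolation), for which the expected answer is p^m q^(n-w) with p = 1 - q. The function names and the value q = 2/3 are illustrative.

```python
from fractions import Fraction
from math import comb


def t_params(q):
    """t_n = sum_k (-1)**(n-k) * C(n,k) / q_k, for n = 0 .. len(q) - 1."""
    return [
        sum((-1) ** (n - k) * comb(n, k) / q[k] for k in range(n + 1))
        for n in range(len(q))
    ]


def alpha(n, w, m, t):
    """Transition probability from an n-element causet, for a precursor set
    of cardinality w with m maximal elements (formula as assumed above)."""
    numerator = sum(comb(w - m, w - l) * t[l] for l in range(m, w + 1))
    denominator = sum(comb(n, j) * t[j] for j in range(n + 1))
    return numerator / denominator


Q = Fraction(2, 3)                  # illustrative value; p = 1 - Q
Q_LIST = [Q ** k for k in range(6)]  # percolation: q_n = q**n, with q_0 = 1
T = t_params(Q_LIST)
```

For this choice the parameters come out as t_n = (p/q)^n, and for a completely disconnected newborn element (w = m = 0) the formula returns q_n itself, as it should.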



5.2.4 The fate of SEL 

Rideout and Sorkin's Bell causality, eq. 5.2, seems intuitively true. For its idea is
that the prospects for a possible transition being realized should not depend on facts 
(in this framework, without matter: facts about the causal structure of spacetime) 
at spacelike separation from the transition's precursor set. But one naturally asks, 
especially after seeing formulations of SEL: surely it is weaker than it needs to be? 
Why make a statement of equality of ratios (of transition probabilities), instead of a 
stronger statement of equality of probabilities themselves — as SEL does? 

That is, one naturally suggests imposing a simpler assumption, about just one 
transition: viz. the obvious transcription to the causet approach of the idea of SEL. 
Thus let us instead require that for any transition, $C \to C_1$:

\[ prob(C \to C_1) = prob(B \to B_1) \tag{5.4} \]

where

(i) $B \in \Omega(m)$, $m \leq n$, is the precursor set for $C \to C_1$, and

(ii) $B_1 \in \Omega(m + 1)$ is B with an element added in the manner of the transition $C \to C_1$.

But Rideout and Sorkin (and subsequent causet authors) have good reason to impose
only their weaker Bell causality, eq. 5.2. For eq. 5.4 has the following defect
(Dowker, private communication). Fix a causet B and consider all the possible
transitions from B: $B \to B_i$, $i = 1, 2, \ldots, k$. Then of course

\[ \sum_{i=1}^{k} prob(B \to B_i) = 1 . \tag{5.5} \]

Now consider any "extending" C, with (maybe many) points spacelike to B; (to be
precise, consider any causet C containing a copy of B as a sub-causet, yet with no
elements of C to the future or past of any of the copy of B). For any such C, there
are corresponding transitions $C \to C_i$, and so eq. 5.4 would demand:

\[ \sum_{i=1}^{k} prob(C \to C_i) = 1 . \tag{5.6} \]

But eq. 5.6 is unfair to the Elsewhere (within C) of B! For C is not obliged to "grow"
at the current stage from its subset B that we happened to consider for our instance,
eq. 5.5, of the law of total probability!
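
The defect can be seen numerically in the smallest case. The sketch assumes, purely as an example dynamics, transitive percolation (probability p^m (1 - p)^(n - w) for a precursor set of w elements with m maximal, in an n-element causet), with B the singleton {x} and C the two-element antichain {x, z}. These choices, and the function names, are illustrative. The code sums the transition probabilities from C, first over all precursor sets and then only over those lying within B.

```python
from fractions import Fraction
from itertools import combinations


def down_sets(past):
    """All subsets closed under ancestors; `past` maps elements to ancestor sets."""
    elements = sorted(past)
    result = []
    for size in range(len(elements) + 1):
        for combo in combinations(elements, size):
            candidate = frozenset(combo)
            if all(past[y] <= candidate for y in candidate):
                result.append(candidate)
    return result


def maximal_elements(subset, past):
    """Elements of `subset` with no descendant inside `subset`."""
    return {x for x in subset if not any(x in past[y] for y in subset)}


def percolation_probability(precursor, past, p):
    """Assumed example dynamics: p**m * (1 - p)**(n - w)."""
    m = len(maximal_elements(precursor, past))
    return p ** m * (1 - p) ** (len(past) - len(precursor))


def total_probability(past, p, inside=None):
    """Sum over precursor sets, optionally only those contained in `inside`."""
    return sum(
        percolation_probability(s, past, p)
        for s in down_sets(past)
        if inside is None or s <= inside
    )


P = Fraction(1, 3)
B_PAST = {"x": frozenset()}
C_PAST = {"x": frozenset(), "z": frozenset()}
```

In this dynamics the transitions from C whose precursor sets lie within B carry total probability 1 - p, not 1: the remaining p belongs to transitions in which the newborn element lies to the future of z. Eq. 5.6 would leave no room for those.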

Besides, this defect "ramifies". For it is straightforward to show (Dowker, ibid.)
that eq. 5.4 trivializes the dynamics in that it implies that the only possible dynamics
is either:

(i) with probability 1, only the infinite chain grows; (here the 'infinite chain' is the
linearly ordered poset $\mathbb{N} := 1 \prec 2 \prec 3 \prec \ldots$); or

(ii) with probability 1, only the infinite anti-chain grows; (an anti-chain is the
"completely flat" poset, with no two elements related by $\prec$).

To end: this discussion prompts a philosopher's puzzle and a technical challenge.
The puzzle (exercise for the reader!) is to say exactly how the contrast between
stochastic events in a fixed spacetime, vs. stochastic events of spacetime structure, makes SEL
in the former framework (as in Sections 2 to 4) immune to the argument just given,
that convicted eq. 5.4 of being "unfair to the Elsewhere".

The technical challenge, suggested by Dowker (and which I leave as a research 
project for the reader) is to formulate, and investigate, other plausible conditions of 
relativistic causality for causets. Might one of these deserve the name 'SEL'? Work for 
the future! 



Acknowledgements: I am grateful to audiences at the Universities of Cambridge,
Notre Dame, Konstanz and Oxford; to Brandon Fogel, Gerd Grasshoff, Gabor
Hofer-Szabo, Dennis Lehmkuhl, Thomas Müller, Tomasz Placek, Samuel Portmann, Miklos
Redei, Rafael Sorkin, Adrian Wüthrich for comments and correspondence (I only wish
I could have acted on all suggestions!); to two referees; and to Adrian Kent and Fay
Dowker, respectively, for teaching me the contents of Sections 5.1 and 5.2; and
especially to Dennis Lehmkuhl for the diagrams.



6 References 

Aspect, A. et al. (1982), 'Experimental Tests of Bell inequalities using Time-Varying
Analyzers', Physical Review Letters, 49, pp. 1804-1807. 

Bacciagaluppi, G. (2002), 'Remarks on Spacetime and Locality in Everett's Inter- 
pretation', in Placek and Butterfield eds. (2002); pp. 105-122. 

Bell, J. (1975), 'The Theory of Local Beables', in Bell (2004); page references are to
reprint. 

Bell, J. (2004) Speakable and Unspeakable in Quantum Mechanics, Cambridge Uni- 
versity Press; second edition; page references are to this edition whose pagination of 
the papers cited matches the first (1987) edition. 

Belnap, N. (2005), 'A Theory of Causation: Causae causantes (Originating causes) 
as Inus Conditions in Branching Spacetimes', British Journal for the Philosophy of 
Science 56, pp. 221-253. 

Belnap, N. and Szabo, L. (1996), 'Branching Spacetime Analysis of the GHZ The- 
orem', Foundations of Physics 26, pp. 989-1002. 



Berkovitz, J. (1998), 'Aspects of Quantum Non-locality I:', Studies in History and 
Philosophy of Modern Physics 29B, pp. 183-222. 

Berkovitz, J. (1998a), 'Aspects of Quantum Non-locality II:', Studies in History 
and Philosophy of Modern Physics 29B, pp. 509-546. 

Berkovitz, J. (2002), 'On Causal Loops in the Quantum Realm', in Placek and
Butterfield eds. (2002); pp. 235-257. 

Berkovitz, J. (2007), 'Action at a Distance in Quantum Mechanics', Stanford
Encyclopedia of Philosophy. Available at: http://www.seop.leeds.ac.uk/entries/qm-action-distance/



Brightwell, G., Dowker, H., Garcia, R., Henson, J., and Sorkin, R. (2002), 'Gen- 
eral Covariance and the "Problem of Time" in a Discrete Cosmology', in K. Bowden 
(ed.). Correlations: Proceedings of the ANPA 23 Conference, August 2001, Cambridge, 
England, pp. 1-17. Available at: arXiv:gr-qc/0202097 

Brightwell, G., Dowker, H., Garcia, R., Henson, J., and Sorkin, R. (2003),
'"Observables" in Causal Set Cosmology', Physical Review D67, 084031. Available at:
arXiv:gr-qc/0210061



Bub, J. (1997), Interpreting the Quantum World, Cambridge: University Press. 

Butterfield, J. (1989), 'A spacetime approach to the Bell Inequality', in Philosophical
Consequences of Quantum Theory, eds. J. Cushing and E. McMullin, Notre Dame
University Press, pp. 114-144. 

Butterfield, J. (1992), 'Bell's Theorem: what it Takes', British Journal for the 
Philosophy of Science 43, pp. 41-83.

Butterfield, J. et al. (1993), 'Parameter Dependence in Dynamical Models of Stat- 
evector Reduction', International Journal of Theoretical Physics 32, pp. 2287-2304. 

Butterfield, J. (1994), 'Outcome Dependence and Stochastic Einstein Nonlocality',
in Logic and Philosophy of Science in Uppsala, eds. D. Prawitz and D. Westerståhl, pp.
385-424.

Butterfield, J. (1996), 'Vacuum Correlations and Outcome Dependence in Algebraic
Quantum Field Theory', in Fundamental Aspects of Quantum Theory, eds. D. Greenberger
and A. Zeilinger, New York Academy of Sciences, New York, pp. 768-785.

Butterfield, J. (2007), 'Reconsidering Relativistic Causality', forthcoming in Inter- 
national Studies in the Philosophy of Science. 

Cramer, J. (1986), 'The Transactional Interpretation of Quantum Mechanics', Re- 
views of Modern Physics 58, pp. 647-687. 

Dowker, H. F. (2005), 'Causal Sets and the Deep Structure of Spacetime'. Available
at: arXiv:gr-qc/0508109

Earman, J. (2006), 'Pruning some Branches from "Branching Spacetimes"', preprint.

Fahmi, A. and Golshani, M. (2006), 'Locality and the Greenberger-Horne-Zeilinger
theorem'. Available at: arXiv:quant-ph/0608049



Geroch, R. and Horowitz, G. (1979), 'Global structure of spacetimes', in General 
Relativity: an Einstein Centennial Survey, ed. S. Hawking and W. Israel, Cambridge 
University Press, pp. 212-293. 

Grasshoff, G. et al. (2005), 'Minimal Assumption Derivation of a Bell-type Inequal- 
ity', British Journal for the Philosophy of Science 56, pp. 663-680. 

Halvorson, H. and Clifton, R. (2000), 'Generic Bell correlation between arbitrary
local algebras in quantum field theory', Journal of Mathematical Physics 41, pp.
1711-1717; reprinted in R. Clifton, Quantum Entanglements (eds. J. Butterfield and H.
Halvorson), Oxford University Press, 2004. Available at: arXiv:math-ph/9909013

Hawking, S. and Ellis, G. (1973), The Large-Scale Structure of Spacetime,
Cambridge: University Press.

Hellman, G. (1982), 'Stochastic Einstein Locality and the Bell Theorems', Synthese 
53, pp. 461-504.

Henson, J. (2005), 'Comparing Causality Principles', Studies in History and Phi- 
losophy of Modern Physics 36B, pp. 519-543. 

Hofer-Szabo, G. et al. (1999), 'On Reichenbach's common cause principle and 
Reichenbach's notion of common cause', British Journal for the Philosophy of Science 
50, pp. 377-399. Available at: arXiv:quant-ph/9805066

Hofer-Szabo, G. et al. (2002), 'Common-causes are not common common-causes',
Philosophy of Science 69, pp. 623-633. Available at: http://philsci-archive.pitt.edu/archive/00000353/

Hofer-Szabo, G. (2007), 'Separate vs. common-common-cause-type derivations of 
the Bell inequalities', forthcoming in Synthese. 

Kent, A. (2005), 'Causal Quantum Theory and the Collapse Locality Loophole', 
Physical Review A 72, 012107. Available at: arXiv:quant-ph/0204104

Kent, A. (2005a), 'A Proposed Test of the Local Causality of Spacetime'. Available
at: arXiv:gr-qc/0507045



Kowalski, T. and Placek, T. (1999), 'Outcomes in Branching Spacetime and GHZ-Bell
Theorems', British Journal for the Philosophy of Science 50, pp. 349-375.

Lewis, D. (1980), 'A Subjectivist's Guide to Objective Chance', in R.C. Jeffrey ed.,
Studies in Inductive Logic and Probability, volume II, University of California Press;
reprinted in Lewis' Philosophical Papers, volume II, Oxford University Press 1986; 
page references to reprint. 

Lewis, D. (1986), On the Plurality of Worlds, Oxford: Blackwell. 

Loève, M. (1963), Probability Theory, Princeton: Van Nostrand.

Müller, T. (2005), 'Probability Theory and Causation: a Branching Spacetimes
Analysis', British Journal for the Philosophy of Science 56, pp. 487-520. 

Muller, F. and Butterfield, J. (1994), 'Is Algebraic Lorentz-covariant Quantum Field
Theory Stochastic Einstein Local?', Philosophy of Science 61, pp. 457-474. 



Norton, J. (2003), 'Causation as Folk Science', Philosophers' Imprint 3,
http://www.philosophersimprint.org/003004/; to be reprinted in H. Price and R. Corry,
Causation and the Constitution of Reality, Oxford: University Press.

Norton, J. (2006), 'Do the Causal Principles of Modern Physics Contradict Causal 
Anti-fundamentalism?', to appear in Causality: Historical and Contemporary, eds. P. 
K. Machamer and G. Wolters, Pittsburgh: University of Pittsburgh Press. Available 
at: http://philsci-archive.pitt.edu/archive/00002735/

Placek, T. (2000), 'Stochastic Outcomes in Branching Spacetime: Analysis of Bell's 
Theorem', British Journal for the Philosophy of Science 51, pp. 445-475.

Placek, T. (2000a), Is Nature Deterministic?, Jagiellonian University Press. 

Placek, T. (2002), 'Partial Indeterminism is Enough: a branching analysis of Bell- 
type inequalities', in Placek and Butterfield eds. (2002); pp. 317-342. 

Placek, T. (2004), 'Screening-off Conditions in Bell's Theorem: a Branching Spacetimes
Analysis', in L. Běhounek and M. Bílková eds., Logica Yearbook 2004, Prague:
Filosofia, pp. 243-354. 

Placek, T. and Butterfield, J., eds. (2002), Non-locality and Modality, Dordrecht: 
Kluwer Academic (Nato Science Series, vol. 64). 

Portmann, S. and Wüthrich, A. (2007), 'Minimal assumption derivation of a weak
Clauser-Horne inequality', forthcoming in Studies in History and Philosophy of Modern
Physics. Available at: arXiv:quant-ph/0604216

Price, H. (1996), Time's Arrow and Archimedes' Point, Oxford: University Press. 

Redei, M. (1991), 'Bell's Inequalities, Relativistic Quantum Field Theory and the 
Problem of Hidden Variables', Philosophy of Science 58, pp. 628-638. 

Redei, M. (2002), 'Reichenbach's common cause principle and quantum correla- 
tions', in Placek and Butterfield eds. (2002); pp. 259-270. 

Redei, M. and Summers, S. (2002), 'Local Primitive Causality and the Common
Cause Principle in Quantum Field Theory', Foundations of Physics 32, pp. 335-355.
Available at: arXiv:quant-ph/0108023



Redhead, M. (1987), Incompleteness, Nonlocality and Realism, Oxford: University
Press. 

Redhead, M. (1995), 'More Ado about Nothing', Foundations of Physics, 25, pp. 
123-137. 

Reichenbach, H. (1956), The Direction of Time, Berkeley: University of California 
Press. 

Rideout, D. and Sorkin, R. (2000), 'A Classical Sequential Growth Dynamics for 
Causal Sets', Physical Review D61, 024002. Available at: arXiv:gr-qc/9904062

Santos, E. (2005), 'Bell's Theorem and the Experiments: Increasing empirical support
for local realism?', Studies in the History and Philosophy of Modern Physics 36B,
pp. 544-565.

Seevinck, M. (forthcoming), 'Deriving standard Bell inequalities from non-locality
and its repercussions for the (im)possibility of doing experimental metaphysics'.

Shimony, A. (2004), 'Bell's Theorem', Stanford Encyclopedia of Philosophy.
Available at: http://www.seop.leeds.ac.uk/entries/bell-theorem/



Socolovsky, M. (2003), 'Bell inequality, non-locality and analyticity', Physics Letters
A, 316, pp. 10-16. Available at: arXiv:quant-ph/0305135



Suarez, M. (2007), 'Causal Inference in Quantum Mechanics: A Reassessment', in 
F. Russo and J. Williamson (eds.), Causality and Probability in the Sciences, London
College Texts, pp. 65-106. 

Summers, S. (1990), 'On the Independence of Local Algebras in Quantum Field 
Theory', Reviews in Mathematical Physics 2, pp. 201-247. 

Szabo, L. (2000), 'On an attempt to resolve the EPR-Bell paradox via Reichenbachian
concept of common cause', International Journal of Theoretical Physics 39,
pp. 901-911. Available at: arXiv:quant-ph/9806074

Timpson, C. and Brown, H. (2002), 'Entanglement and Relativity', in Understanding
Physical Knowledge, R. Lupacchini and V. Fano (eds.), 2002; University of Bologna,
CLUEB. Available at: arXiv:quant-ph/0212140, and at:
http://philsci-archive.pitt.edu/archive/00001618/



Uffink, J. (1999), 'The Principle of the Common Cause faces the Bernstein Para- 
dox', Philosophy of Science 66, pp. S512-S525 (Supplement: Proceedings of 1998 
Conference).

Wald, R. (1984), General Relativity, Chicago: University of Chicago Press. 

Weihs, G. et al. (1998), 'Violation of Bell inequality under strict Einstein locality
conditions', Physical Review Letters 81, pp. 5039-5043.



